So, in this article, we will discuss the statistical tests used for hypothesis testing, including both parametric and non-parametric tests.
Table of Contents
T-test
Z-test
F-test
ANOVA
Chi-square
Mann-Whitney U-test
Kruskal-Wallis H-test
Parametric Tests
The basic principle behind the parametric tests is that we have a fixed set of parameters that are
used to determine a probabilistic model that may be used in Machine Learning as well.
Parametric tests are those tests for which we have prior knowledge of the population distribution (i.e., normal), or, if not, for which we can easily approximate it to a normal distribution. The parameters of such a distribution are:
Mean
Standard Deviation
Whether a test counts as parametric ultimately depends on these population assumptions. There are many parametric tests available, some of which are as follows:
To find the confidence interval for the population mean with the help of a known standard deviation.
To determine the confidence interval for the population mean along with an unknown standard deviation.
To find the confidence interval for the population variance.
To find the confidence interval for the difference of two means, with an unknown value of the standard deviation.
Non-parametric Tests
In non-parametric tests, we don't make any assumptions about the parameters of the population we are studying. In fact, these tests don't depend on the population distribution. Hence, there is no fixed set of parameters available, and there is also no distribution (normal or otherwise) that we need to assume. This is also the reason that non-parametric tests are referred to as distribution-free tests.
These days, non-parametric tests are gaining popularity and influence, for a few reasons:
The main reason is that, unlike parametric tests, they do not require the data to meet strict conditions.
The second reason is that we do not need to make assumptions about the population on which we are doing the analysis.
Most of the non-parametric tests available are also very easy to apply and to understand, i.e., their complexity is very low.
T-Test
2. Essentially, it tests the significance of the difference of mean values when the sample size is small (i.e., less than 30) and the population standard deviation is not available.
3. Assumptions of this test:
A T-test can be a:
One Sample T-test: To compare a sample mean with the population mean.

t = (x̄ − μ) / (s / √n)

where x̄ is the sample mean, μ is the population mean, s is the sample standard deviation, and n is the sample size.
Conclusion:
If the value of the test statistic is greater than the table value -> Reject the null
hypothesis.
If the value of the test statistic is less than the table value -> Do not reject the null
hypothesis.
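As a sketch, the one-sample t-test above can be run with SciPy's `ttest_1samp`; the sample data and the hypothesised mean of 12.0 below are invented for illustration:

```python
from scipy import stats

# Hypothetical sample, n = 10 (< 30), population sigma unknown
sample = [12.1, 11.4, 13.2, 12.8, 11.9, 12.5, 13.0, 12.2, 11.7, 12.6]

# One-sample t-test: H0 says the population mean is 12.0
t_stat, p_value = stats.ttest_1samp(sample, popmean=12.0)

# Reject H0 when p_value is below the chosen alpha (e.g. 0.05)
reject_null = p_value < 0.05
```

Comparing the p-value with alpha is equivalent to the table-lookup rule above: a test statistic beyond the critical (table) value corresponds to a p-value below alpha.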
Z-Test
2. It is used to determine whether the means are different when the population variance is known and the sample size is large (i.e., greater than 30).
One Sample Z-test: To compare a sample mean with the population mean.

z = (x̄ − μ) / (σ / √n)

where x̄ is the sample mean, μ is the population mean, σ is the known population standard deviation, and n is the sample size.
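SciPy has no dedicated one-sample z-test function, but the statistic is easy to compute by hand from the formula above. The sample mean, population values, and sample size below are hypothetical:

```python
import math
from scipy.stats import norm

# Hypothetical setup: population sigma is known, sample is large (n > 30)
sample_mean = 52.0
pop_mean = 50.0     # mu under H0
pop_sigma = 10.0    # known population standard deviation
n = 100             # sample size

# z = (x̄ - μ) / (σ / √n)
z = (sample_mean - pop_mean) / (pop_sigma / math.sqrt(n))

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - norm.cdf(abs(z)))
```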
F-Test
2. It is a test for the null hypothesis that two normal populations have the same variance.
F = s1² / s2²
6. By changing the variances in the ratio, the F-test becomes a very flexible test.
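A minimal sketch of the variance-ratio F-test using SciPy's F distribution; the two samples below are invented, and are assumed to come from normal populations:

```python
from statistics import variance
from scipy import stats

# Two hypothetical samples assumed drawn from normal populations
s1 = [14.2, 15.1, 13.8, 14.9, 15.4, 14.0, 13.6, 15.2]
s2 = [13.9, 14.1, 14.0, 14.3, 13.8, 14.2, 14.1, 13.9]

# F = s1^2 / s2^2, the ratio of the two (unbiased) sample variances
F = variance(s1) / variance(s2)

# Two-tailed p-value from the F distribution with (n1 - 1, n2 - 1) df
df1, df2 = len(s1) - 1, len(s2) - 1
p_value = 2 * min(stats.f.cdf(F, df1, df2), stats.f.sf(F, df1, df2))
```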
ANOVA
3. It is used to test the significance of the differences in the mean values among more than two
sample groups.
4. It uses the F-test to statistically test the equality of means and the relative variance between them.
5. Assumptions of this test:
Chi-Square Test
3. It helps in assessing the goodness of fit between a set of observed values and those expected
theoretically.
4. It makes a comparison between the expected frequencies and the observed frequencies.
6. If there is no difference between the expected and observed frequencies, then the value of chi-square is equal to zero.
11. Chi-square as a parametric test is used as a test for population variance based on sample
variance.
12. If we take each one of a collection of sample variances, divide them by the known population variance and multiply these quotients by (n-1), where n means the number of items in the sample, we obtain a chi-square distribution with (n-1) degrees of freedom.
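The relationship in point 12 can be used directly as a test for the population variance. A sketch in Python, with a made-up sample and a hypothesised population variance:

```python
from statistics import variance
from scipy.stats import chi2

# Hypothetical sample; H0: the population variance equals sigma0_sq
sample = [4.8, 5.2, 5.6, 4.9, 5.1, 5.3, 4.7, 5.4, 5.0, 5.2]
sigma0_sq = 0.04  # hypothesised population variance

n = len(sample)
s_sq = variance(sample)  # unbiased sample variance

# (n - 1) * s^2 / sigma0^2 follows a chi-square distribution with n - 1 df
chi2_stat = (n - 1) * s_sq / sigma0_sq

# Two-tailed p-value
p_value = 2 * min(chi2.cdf(chi2_stat, n - 1), chi2.sf(chi2_stat, n - 1))
```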
Mann-Whitney U-Test
2. This test is used to investigate whether two independent samples were selected from populations having the same distribution.
3. It is a true non-parametric counterpart of the T-test and gives the most accurate estimates of
significance especially when sample sizes are small and the population is not normally
distributed.
4. It is based on the comparison of every observation in the first sample with every observation in the second sample. The test statistic is

U1 = R1 − n1(n1 + 1)/2

where n1 is the sample size for sample 1, and R1 is the sum of ranks in Sample 1.
When consulting the significance tables, the smaller of U1 and U2 is used. Knowing that R1+R2 = N(N+1)/2 and N=n1+n2, and doing some algebra, we find that the sum is U1 + U2 = n1·n2.
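In practice, the ranking and the U statistic are handled by `scipy.stats.mannwhitneyu`. A sketch with two hypothetical groups:

```python
from scipy.stats import mannwhitneyu

# Hypothetical scores from two independent groups
group_a = [19, 22, 16, 29, 24, 17, 21]
group_b = [20, 11, 17, 12, 15, 14, 13]

# H0: the two samples come from the same distribution
u_stat, p_value = mannwhitneyu(group_a, group_b, alternative='two-sided')

# The reported U can never exceed n1 * n2, since U1 + U2 = n1 * n2
n1, n2 = len(group_a), len(group_b)
```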
Kruskal-Wallis H-test
2. This test is used for comparing two or more independent samples of equal or different sample
sizes.
3. It extends the Mann-Whitney U-Test, which is used for comparing only two groups.
4. One-Way ANOVA is the parametric equivalent of this test, which is why it is also known as the one-way ANOVA on ranks.
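A sketch of the Kruskal-Wallis H-test with `scipy.stats.kruskal`; the three samples below are hypothetical:

```python
from scipy.stats import kruskal

# Hypothetical measurements from three independent groups
g1 = [27, 2, 4, 18, 7, 9]
g2 = [20, 8, 14, 36, 21, 22]
g3 = [34, 31, 3, 23, 30, 6]

# H0: all groups come from populations with the same distribution
h_stat, p_value = kruskal(g1, g2, g3)
```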
What is ANOVA?
Developed by Ronald Fisher, ANOVA stands for Analysis of Variance. One-Way Analysis of
Variance tells you if there are any statistical differences between the means of three or more
independent groups.
You might use Analysis of Variance (ANOVA) as a marketer, when you want to test a particular
hypothesis. You would use ANOVA to help you understand how your different groups respond,
with a null hypothesis for the test that the means of the different groups are equal. If there is a statistically significant result, it means that at least two of the group means differ.
The one-way ANOVA can help you know whether or not there are significant differences
between the means of your independent variables (such as the first example: age, sex, income).
When you understand how each independent variable’s mean is different from the others, you
can begin to understand which of them has a connection to your dependent variable (landing
page clicks), and begin to learn what is driving that behavior.
You may want to use ANOVA to help you answer questions like this:
Do age, sex, or income have an effect on whether someone clicks on a landing page?
Do location, employment status, or education have an effect on NPS score?
One-way ANOVA can help you know whether or not there are significant differences between
the groups of your independent variables (such as USA vs Canada vs Mexico when testing a
Location variable). You may want to test multiple independent variables (such as Location,
employment status or education). When you understand how the groups within the independent
variable differ (such as USA vs Canada vs Mexico, not location, employment status, or
education), you can begin to understand which of them has a connection to your dependent
variable (NPS score).
“Do all your locations have the same average NPS score?”
You should note, though, that ANOVA will only tell you whether the average NPS scores across all locations are the same or not; it does not tell you which location has a significantly higher or lower average NPS score.
This is defined by how many independent variables are included in the ANOVA test. One-way
means the analysis of variance has one independent variable. Two-way means the test has two
independent variables. An example of this may be an independent variable of brand of drink (one-way), or independent variables of brand of drink plus how many calories it has, or whether it's original or diet (two-way).
Like other types of statistical tests, ANOVA compares the means of different groups and shows
you if there are any statistical differences between the means. ANOVA is classified as an
omnibus test statistic. This means that it can’t tell you which specific groups were statistically
significantly different from each other, only that at least two of the groups were.
It’s important to remember that the main ANOVA research question is whether the sample
means are from different populations. There are two assumptions upon which ANOVA rests:
First: Whatever the technique of data collection, the observations within each sampled population are normally distributed. Second: the sampled populations have a common (equal) variance.
The one-way ANOVA tests for an overall relationship between the two variables, and the
pairwise tests test each possible pair of groups to see if one group tends to have higher values
than the other.
The Overall Stat Test of Averages acts as an Analysis of Variance (ANOVA). An ANOVA tests
the relationship between a categorical and a numeric variable by testing the differences between
two or more means. This test produces a p-value to determine whether the relationship is
significant or not.
Ensure your “banner” (column) variable has 3+ groups and your “stub” (rows) variable
has numbers (like Age) or numeric recodes (like “Very Satisfied” = 7)
Click “Overall stat test of averages”
You’ll see a basic ANOVA p-value
A one way ANOVA will allow you to distinguish that at least two groups were different from
each other. Once you begin to understand the difference between the independent variables you
will then be able to see how each behaves with your dependent variable. (See landing page
example above)
One-Way ANOVA
One-way ANOVA is generally the most used method of performing the ANOVA
test. It is also referred to as one-factor ANOVA, between-subjects ANOVA, and
an independent factor ANOVA. It is used to compare the means of two or more independent groups using the F-distribution.
To carry out the one-way ANOVA test, you should have only one independent variable with at least two levels. One-way ANOVA does not differ much from a t-test.
Example where one-way ANOVA is used
Suppose a teacher wants to know how effective his teaching has been. He can split the students of the class into different groups and assign them different projects related to the topics taught.
He can use one-way ANOVA to compare the average score of each group. He
can get a rough understanding of topics to teach again. However, he won’t be
able to identify the student who could not understand the topic.
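The teacher's comparison could be sketched with `scipy.stats.f_oneway`; the project scores below are invented for illustration:

```python
from scipy.stats import f_oneway

# Hypothetical average project scores for three groups of students
group1 = [85, 86, 88, 75, 78, 94, 98]
group2 = [91, 92, 93, 85, 87, 84, 82]
group3 = [79, 78, 88, 94, 92, 85, 83]

# H0: all group means are equal; the test uses the F-distribution
f_stat, p_value = f_oneway(group1, group2, group3)
```

A small p-value would suggest at least two group means differ, but, as noted above, not which ones.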
Two-way ANOVA
Two-way ANOVA is carried out when you have two independent variables. It is
an extension of one-way ANOVA. You can use the two-way ANOVA test when
your experiment has a quantitative outcome and there are two independent
variables.
Two-way ANOVA is performed in two ways:
1. Two-way ANOVA with replication: It is performed when there are two
groups and the members of these groups are doing more than one
thing. Our example in the beginning can be a good example of two-way
ANOVA with replication.
2. Two-way ANOVA without replication: This is used when you have only
one group but you are double-testing that group. For example, a patient
is being observed before and after medication.
Assumptions for Two-way ANOVA
The population must be close to a normal distribution.
Samples must be independent.
Population variances must be equal.
Groups must have equal sample sizes.
What is a Chi-square test?
A Chi-square test is a hypothesis testing method. Two common Chi-square tests involve
checking if observed frequencies in one or more categories match expected frequencies.
There are two commonly used Chi-square tests: the Chi-square goodness of fit test and the Chi-
square test of independence. Both tests involve variables that divide your data into categories. As
a result, people can be confused about which test to use. The table below compares the two tests.
Visit the individual pages for each type of Chi-square test to see examples along with details on
assumptions and calculations.
Theoretical distribution used in the test: Chi-square for both tests.

Degrees of freedom:

Goodness of fit test: number of categories minus 1. In our example, number of flavors of candy minus 1.
Test of independence: number of categories for first variable minus 1, multiplied by number of categories for second variable minus 1. In our example, number of movie categories minus 1, multiplied by 1 (because snack purchase is a Yes/No variable and 2-1 = 1).
For both the Chi-square goodness of fit test and the Chi-square test of independence, you
perform the same analysis steps, listed below. Visit the pages for each type of test to see these
steps in action.
1. Define your null and alternative hypotheses before collecting your data.
2. Decide on the alpha value. This involves deciding the risk you are willing to take of
drawing the wrong conclusion. For example, suppose you set α=0.05 when testing for
independence. Here, you have decided on a 5% risk of concluding the two variables are
independent when in reality they are not.
3. Check the data for errors.
4. Check the assumptions for the test. (Visit the pages for each test type for more detail on
assumptions.)
5. Perform the test and draw your conclusion.
Both Chi-square tests in the table above involve calculating a test statistic. The basic idea behind
the tests is that you compare the actual data values with what would be expected if the null
hypothesis is true. The test statistic involves finding the squared difference between actual and
expected data values, and dividing that difference by the expected data values. You do this for
each data point and add up the values.
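The steps for the statistic (square the difference between observed and expected, divide by the expected value, sum over categories) can be sketched as follows; the observed counts are hypothetical:

```python
from scipy.stats import chisquare

# Hypothetical observed counts for four candy flavors, expected uniform
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

# Manual statistic: sum of (O - E)^2 / E over all categories
manual_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Same statistic via SciPy's goodness-of-fit test
chi2_stat, p_value = chisquare(observed, f_exp=expected)
```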
Independence
When considering student sex and course choice, a χ2 test for independence could be used. To do
this test, the researcher would collect data on the two chosen variables (sex and courses picked)
and then compare the frequencies at which male and female students select among the offered
classes using the formula given above and a χ2 statistical table.
If there is no relationship between sex and course selection (that is, if they are independent), then
the actual frequencies at which male and female students select each offered course should be
expected to be approximately equal, or conversely, the proportion of male and female students in
any selected course should be approximately equal to the proportion of male and female students
in the sample. A χ2 test for independence can tell us how likely it is that random chance can
explain any observed difference between the actual frequencies in the data and these theoretical
expectations.
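A sketch of the test for independence using `scipy.stats.chi2_contingency` on a hypothetical sex-by-course contingency table:

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = student sex, columns = course choice
#          Math  Biology  History
table = [[  40,      30,      30],   # male
         [  35,      45,      20]]   # female

# H0: sex and course choice are independent
chi2_stat, p_value, dof, expected = chi2_contingency(table)
```

With 2 rows and 3 columns, the degrees of freedom are (2-1) x (3-1) = 2, matching the rule described earlier.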
Goodness-of-Fit
χ2 provides a way to test how well a sample of data matches the (known or assumed)
characteristics of the larger population that the sample is intended to represent. If the sample data
do not fit the expected properties of the population that we are interested in, then we would not
want to use this sample to draw conclusions about the larger population.
For example, consider an imaginary coin with exactly a 50/50 chance of landing heads or tails, and a real coin that you toss 100 times. If this real coin is fair, then it will also have an equal
probability of landing on either side, and the expected result of tossing the coin 100 times is that
heads will come up 50 times and tails will come up 50 times. In this case, χ2 can tell us how well
the actual results of 100 coin flips compare to the theoretical model that a fair coin will give
50/50 results. The actual toss could come up 50/50, or 60/40, or even 90/10. The farther away the actual results of the 100 tosses are from 50/50, the worse the fit of this set of tosses to the theoretical expectation of 50/50, and the more likely we might conclude that this coin is not actually a fair coin.
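The coin example can be sketched directly with `scipy.stats.chisquare`, using a hypothetical 60/40 outcome:

```python
from scipy.stats import chisquare

# Hypothetical result of 100 tosses: 60 heads, 40 tails
observed = [60, 40]
expected = [50, 50]   # fair-coin expectation

# chi2_stat = (60-50)^2/50 + (40-50)^2/50 = 4.0
chi2_stat, p_value = chisquare(observed, f_exp=expected)
```

Here the p-value falls below 0.05, so at that alpha we would doubt that the coin is fair, even though a 60/40 split on its own might not look suspicious.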