DAFM - Unit 6 - RM - Hypothesis Testing
Unit 6
Hypothesis Testing
Unit 6 - Hypothesis Testing
Learning Objectives
© Copyright 2017, all rights reserved. Manipal Global Education Services Pvt. Ltd. 2
Table of contents
1. Introduction
2. Hypothesis
2.1 Hypothesis Testing
2.2 Null and Alternative Hypothesis
3. One-tailed and Two-tailed Tests
4. Type I and Type II Errors
5. Level of Significance
6. p-value
6.1 How small is a ‘small’ p-value?
7. Test Statistic
7.1 Power of a Test
8. Chi-Square Test
8.1 How it is Calculated?
8.2 Application of χ2 Test
9. Chapter Summary
1. Introduction
Decision-making involves identifying alternatives and assessing the pros and cons before arriving at a
decision. In order to arrive at a decision, we have certain assumptions that we need to test. These
assumptions are called hypotheses. To test our assumption statistically, we have to collect data.
This data is usually a sample obtained from a larger population. We have to test it to determine
whether there is a difference between the hypothesised value and the actual value obtained (usually a
test statistic).
We judge the significance using tools such as the chi-square test. These tools can be used to identify
associations between variables and to support decision-making. For instance, you may wish
to know whether there is an association between gender and the preference for mutual funds.
Running a chi-square test will help in determining the association.
2. Hypothesis
The assumption that the researcher holds about some characteristic of the population parameter is called a ‘hypothesis’.
Null hypothesis
The null hypothesis, as the name suggests, is the hypothesis that the experimenter seeks to nullify. It is usually represented as H0.
Alternative hypothesis
The alternate hypothesis is the researcher’s stated assumption. In other words, it is the assumption that the researcher wants to prove. It is usually represented by Ha or H1.
Example:
An XYZ bank manager might claim that the average default rate for outstanding credit card payments is less than 20% in his branch at M.G.
Road. Based on the data collected for one month, it is possible to calculate the proportion of defaults from the sampling experiment. Depending
on the results obtained, the manager can then decide whether or not to reject the claim.
Before we can test this hypothesis, though, we need to understand the basic concepts involved in hypothesis testing. The manager’s
assumption that the default rate is less than 20% is the hypothesis that we are testing, and it is called the alternate hypothesis.
Ha: µ < 20%
The remaining possibilities, that the default rate is greater than or equal to 20%, form the null hypothesis. The null must be
rejected with sufficient statistical evidence before the manager’s claim (i.e., the alternate hypothesis) can be accepted.
H0: µ ≥ 20%
In this case, if there is sufficient evidence to reject the null (i.e., H0: µ ≥ 20%), then the manager’s claim can be accepted.
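The default-rate example can be sketched as a left-tailed one-sample z-test for a proportion. This is only an illustration: the text gives no sample figures, so the counts below (80 defaults among 500 accounts) are hypothetical, and the function name, the 5% cutoff and the use of a normal approximation are assumptions rather than part of the original example. The code writes the default rate as p where the text writes µ.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_proportion_test(defaults, n, p0=0.20, alpha=0.05):
    """z-test of H0: p >= p0 against Ha: p < p0 (normal approximation)."""
    p_hat = defaults / n                 # observed default rate in the sample
    se = sqrt(p0 * (1 - p0) / n)         # standard error under the null value p0
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)        # left-tail probability
    return z, p_value, p_value < alpha


# Hypothetical sample: 80 defaults among 500 accounts (not from the text).
z, p_value, reject_h0 = left_tailed_proportion_test(defaults=80, n=500)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these made-up counts the null would be rejected, which is the situation in which the manager's claim can be accepted.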
[Figure: rejection regions under the bell curve, with 𝛼 in a single tail and 𝛼/2 in each of two tails]
Decision            H0 is true      H1 is true
Reject H0           Type I error    No error
Do not reject H0    No error        Type II error
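The "Reject H0 / H0 is true" cell of the table can be made concrete with a small simulation. The sketch below is an illustration under assumed settings (standard normal data, a right-tailed z-test, a 5% cutoff, a fixed random seed); none of these come from the text. Every sample is drawn with H0 true, so each rejection counted is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(trials=5000, n=25, alpha=0.05, seed=0):
    """Share of right-tailed z-tests that reject H0 although H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        # Sample drawn from the null distribution: mean 0, standard deviation 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / sqrt(n))
        if z > z_crit:
            rejections += 1  # H0 is true here, so this rejection is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen cutoff, since the cutoff is exactly the share of true nulls the test is allowed to reject.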
5. Level of Significance
The level of significance specifies how strong the evidence must be before the null hypothesis is rejected. It corresponds to the rejection region of the bell curve,
and it should be specified by the researcher before conducting the study.
If the level of significance is 5%, the researcher needs evidence against the null hypothesis strong enough that the result falls in the
rejection region (i.e., the 5% level of significance), beyond the 95% confidence interval.
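As a rough illustration of where the rejection region begins, the cutoff values of the standard normal curve for a given level of significance can be computed with Python's standard library. The function name is an assumption made for this sketch; the use of the normal curve follows the bell-curve description above.

```python
from statistics import NormalDist


def critical_values(alpha=0.05):
    """Cutoffs on the standard normal curve for a one- and a two-tailed test."""
    std_normal = NormalDist()
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    return one_tailed, two_tailed


one_tailed, two_tailed = critical_values(0.05)
print(f"one-tailed cutoff: {one_tailed:.3f}")
print(f"two-tailed cutoffs: -{two_tailed:.3f} and +{two_tailed:.3f}")
```

At the 5% level this gives roughly 1.645 for a one-tailed test and 1.960 for a two-tailed test, the values usually read off a Z table.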
For example, you could test the hypothesis that customers who
complain are likely to close their accounts. It may be that a large
number of customers who closed their accounts did indeed complain
before doing so.
It may also be that they closed their accounts for other
reasons. It is safe to assume that a small proportion of customers would
have closed their accounts for other reasons.
Therefore, we cannot assume that all customers who raise a complaint will
close their accounts, and at the same time we cannot reject the
claim without testing it.
The level of significance specifies how much evidence is
required to accept the claim.
[Figure: bell curve with a "Frequency" axis and a region labelled "Accept Null"]
Example: The bank manager claims that the average balance maintained by customers is more than Rs 190000. A study was conducted by collecting 100
account balances of customers, and the average balance was Rs 210000. Assume that the standard deviation is Rs 30000. Check whether the
claim is acceptable at the 95% confidence level.
Solution:
H0 – The average balance maintained by customers is less than or equal to Rs 190000.
H1 – The average balance maintained by customers is more than Rs 190000.
$$Z_{cal} = \frac{210000 - 190000}{30000 / \sqrt{100}} = \frac{20000}{3000} \approx 6.67$$
Table value at the 95% confidence level for a right-tailed test: Ztab = 1.645.
Zcal > Ztab, hence the null hypothesis stands rejected.
Therefore, the alternate hypothesis is accepted: the average
balance maintained by customers is more than Rs 190000.
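The arithmetic of this example can be checked with a short script. It is a sketch using only the figures stated in the example (sample mean 210000, claimed mean 190000, standard deviation 30000, n = 100); the function name is an assumption. With these figures the standard error is 30000/√100 = 3000, so the statistic evaluates to about 6.67, well above the table value of 1.645.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, claimed_mean, sd, n, confidence=0.95):
    """Right-tailed z-test of H0: mean <= claimed_mean against H1: mean > claimed_mean."""
    standard_error = sd / sqrt(n)
    z_cal = (sample_mean - claimed_mean) / standard_error
    z_tab = NormalDist().inv_cdf(confidence)  # right-tail cutoff
    return z_cal, z_tab, z_cal > z_tab


z_cal, z_tab, reject_h0 = right_tailed_z_test(210000, 190000, 30000, 100)
print(f"Zcal = {z_cal:.2f}, Ztab = {z_tab:.3f}, reject H0: {reject_h0}")
```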
6. p-value
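The p-value is used to judge the statistical significance of a result. A small p-value means that the evidence against the null hypothesis is strong.
We reject the null hypothesis if the p-value is smaller than the chosen level of significance.
It is the probability, computed assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed, i.e., the area in the tail of the bell curve beyond the observed statistic.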
6.1 How small is a ‘small’ p-value?
• A p-value of less than 0.01 is clear evidence against the null, and we can accept the alternate hypothesis.
• A p-value between 0.01 and 0.05 shows that the evidence against the null is strong.
• A p-value between 0.05 and 0.10 does not provide clear evidence and is therefore in the grey area.
• A p-value greater than 0.10 is interpreted as weak evidence against the null and is therefore not reliable evidence to support the alternate.
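The bands in the list above can be sketched as a small lookup, together with the p-value of a right-tailed z statistic. The function names and the short labels are assumptions made for illustration, and the text does not say which band a p-value of exactly 0.01, 0.05 or 0.10 belongs to, so the boundary handling below is also an assumption.

```python
from statistics import NormalDist


def right_tailed_p_value(z):
    """Area of the standard normal curve to the right of z."""
    return 1 - NormalDist().cdf(z)


def evidence_against_null(p_value):
    """Map a p-value to the bands listed above (boundary handling is assumed)."""
    if p_value < 0.01:
        return "clear"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "grey area"
    return "weak"


for z in (2.5, 2.0, 1.5, 1.0):
    p = right_tailed_p_value(z)
    print(f"z = {z}: p-value = {p:.4f}, evidence against H0: {evidence_against_null(p)}")
```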
7. Test Statistic
To decide whether or not to reject a null hypothesis, we compute a test statistic from the sample data and compare it with the value expected under the null hypothesis. If the test statistic is close
to the hypothesised value, the null hypothesis is not rejected.
Example
Suppose that you are the branch manager of XYZ bank, that you have introduced a new investment product, and that your bank
claims the product was successful. In the survey to test this claim, the null hypothesis would be that the investment product was successful.
If we were to commit a Type II error during this test, we would be failing to reject a null hypothesis that was actually false. In other words, we would be
saying that the investment product was successful when in fact it was not. This would be a serious error.
We would therefore like to minimise the probability of making a Type II error. To do that, we increase the power of the test, that is, the probability of correctly rejecting a false null hypothesis. One popular
method of increasing the power of the test is to increase the sample size.
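The effect of sample size on power can be illustrated with a right-tailed z-test. This is a sketch under assumed conditions, not the bank example itself: the true mean is taken to exceed the hypothesised mean by 0.2 standard deviations, the level of significance is 5%, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(effect, sd, n, alpha=0.05):
    """Power of a right-tailed z-test when the true mean exceeds the
    hypothesised mean by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = effect / (sd / sqrt(n))   # true difference in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)


# Assumed effect of 0.2 standard deviations, for three sample sizes.
powers = {n: power_right_tailed_z(effect=0.2, sd=1.0, n=n) for n in (25, 100, 400)}
for n, power in powers.items():
    print(f"n = {n:>3}: power = {power:.3f}, P(Type II error) = {1 - power:.3f}")
```

Under these assumptions the power rises with each increase in n, and the probability of a Type II error (one minus the power) falls accordingly.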
8. Chi-Square Test
The chi-square test is used to explore the association between variables. We may, for example, wish to know whether there is an association between
gender and the likelihood of taking up mutual funds. This can be assessed using a chi-square test.
Consider the Employee Appraisal System of your bank. It has been taking place once a year. Now there is a proposal to have two appraisals each year.
Employees from each of the four zones were asked to state if they were in favour of the change. The responses are recorded in the table below.
A table like this is called a contingency table. This table has a dimension of 2 rows by 4 columns. So, this is called a 2 x 4 contingency table.
The bank management wishes to know if the proportion of respondents who prefer the existing system is the same for all the regions.
We can use the χ2 test (pronounced kai square) to explore this question.
Degrees of Freedom
The degrees of freedom are calculated using the rows and columns of the contingency table.
Formula: df = (R − 1) × (C − 1), where R stands for the number of rows and C stands for the number of columns.
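The formula is simple enough to express as a one-line helper. The function name is an assumption; the 2 x 4 case is the appraisal contingency table described above.

```python
def degrees_of_freedom(rows, columns):
    """Degrees of freedom of an R x C contingency table: (R - 1) * (C - 1)."""
    if rows < 2 or columns < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (rows - 1) * (columns - 1)


df_appraisal = degrees_of_freedom(2, 4)  # the 2 x 4 appraisal table
print(f"2 x 4 table: df = {df_appraisal}")
```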
Conditions
1. The frequencies used in the chi-square test must be absolute and not in relative terms.
2. The total number of observations collected for this test must be large.
3. Each of the observations which make up the sample of this test must be independent of each other.
4. As the χ2 test is based wholly on sample data, no assumption is made concerning the population distribution. In other words, it is a non-parametric test.
6. The expected frequency of any item or cell must not be less than 5. If it is, the frequencies of adjacent items or cells should be pooled together in order to
make it at least 5.
7. The data should be expressed in original units for the convenience of comparison and the given distribution should not be replaced by relative
frequencies or proportions.
This test is used only for drawing inferences through a test of the hypothesis, so it cannot be used for estimation of a parameter value.
Interpretation:
After computing the χ2 value, look up the χ2 table to find the critical value that bounds the acceptance region.
The columns of the table are headed with values such as 0.05 for the 5% level of significance, and
each row corresponds to a number of degrees of freedom. Each cell gives the value that separates the
acceptance and rejection regions.
If the calculated value of χ2 falls in the acceptance region, the null hypothesis H0 is
not rejected; otherwise, H0 is rejected.
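The table lookup described above can be sketched as follows. The critical values are the standard 5% upper-tail values of the chi-square distribution for 1 to 6 degrees of freedom; the dictionary and function names, and the two calculated values used in the demonstration, are hypothetical.

```python
# Upper-tail critical values at the 5% level of significance,
# as found in a standard chi-square table (degrees of freedom -> value).
CHI2_CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def chi_square_decision(chi2_cal, df):
    """Compare a calculated chi-square value with the 5% critical value."""
    critical = CHI2_CRITICAL_5_PERCENT[df]
    if chi2_cal > critical:
        return "reject H0"        # value falls in the rejection region
    return "do not reject H0"     # value falls in the acceptance region


# Hypothetical calculated values, both with 3 degrees of freedom.
print(chi_square_decision(10.2, df=3))
print(chi_square_decision(2.1, df=3))
```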
                 Salesman 1   Salesman 2   Salesman 3   Total
Territory I           5           15           20         40
Territory II         10           20           20         50
Territory III        15           25           20         60
Total                30           60           60        150
Solution
O     E                     (O − E)²   (O − E)²/E
5     40 x 30/150 = 8           9        1.1250
10    50 x 30/150 = 10          0        0.0000
15    60 x 30/150 = 12          9        0.7500
15    40 x 60/150 = 16          1        0.0625
20    50 x 60/150 = 20          0        0.0000
25    60 x 60/150 = 24          1        0.0417
20    40 x 60/150 = 16         16        1.0000
20    50 x 60/150 = 20          0        0.0000
20    60 x 60/150 = 24         16        0.6667
Total                                    3.6458
df = (3 − 1) × (3 − 1) = 4
4. Test:
χ2cal = 3.6458
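The calculation for the salesmen and territories table can be reproduced with a short script. It is a sketch: the function name is an assumption, and because the example as given does not state a level of significance, the comparison against the 5% table value for 4 degrees of freedom (9.488) is also an assumption.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a contingency table
    given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Rows are territories I to III, columns are salesmen 1 to 3.
observed = [[5, 15, 20], [10, 20, 20], [15, 25, 20]]
chi2_cal, df = chi_square_statistic(observed)

CRITICAL_5_PERCENT_DF_4 = 9.488  # assumed 5% level; standard table value for df = 4
print(f"chi-square = {chi2_cal:.4f}, df = {df}")
print("reject H0" if chi2_cal > CRITICAL_5_PERCENT_DF_4 else "do not reject H0")
```

Under the assumed 5% level, the calculated value lies below the table value, so the null hypothesis would not be rejected.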
9. Chapter Summary
Here are the key points discussed in this unit.
• The assumption that the researcher holds about some characteristic of the population parameter is called a ‘hypothesis’.
• The objective of testing a hypothesis is to assess whether there is sufficient statistical evidence to confirm the hypothesis.
• The null hypothesis, as the name suggests, is the hypothesis that the experimenter seeks to nullify. It is usually represented as H0.
• The alternate hypothesis is the researcher’s stated assumption. In other words, it is the assumption that the
researcher wants to prove. It is usually represented by Ha or H1.
• When the researcher seeks to prove the hypothesis in one direction, the test is called a one-tailed test. If the researcher is not interested in
the direction per se, it is a two-tailed test.
• Rejecting a null hypothesis that is true is a Type I error.
• Failing to reject a null hypothesis that is false is a Type II error.
• The level of significance is the chance of rejecting a null hypothesis that is actually true. It is usually represented by the Greek
letter alpha (α).
• The p-value is used to judge the statistical significance of a result. A small p-value means that the evidence against the null
hypothesis is strong.
• The chi-square test is used to explore the association between variables.