
Hypothesis Testing

What is Hypothesis Testing?

• Sample information can be used to obtain point estimates or
confidence intervals about population parameters

• Alternatively, sample information can be used to test the validity
of conjectures about these parameters

– Are private banks more profitable than state-owned banks in the
EU countries?
– Are returns on a stock different before and after a stock split?
– Is there a larger variability in real estate prices in Champaign than
in Urbana?


• A hypothesis is a statement about a population parameter from
one or more populations

• Statistically testable hypotheses are formulated based on
theories that are used to make predictions

• A hypothesis test is a procedure that

– States the hypothesis to be tested
– Uses sample information and formulates a decision rule
– Rejects or fails to reject the hypothesis based on the outcome of
the decision rule

Steps in Hypothesis Testing

• The following steps are followed in a hypothesis test

– State the hypothesis
– Identify the appropriate test statistic and its probability distribution
– Specify the significance level
– State the decision rule
– Collect the data and calculate the test statistic
– Make the statistical decision
– Evaluate whether the statistical decision implies a corresponding
financial decision

Stating the Testable Hypotheses

• A hypothesis test always includes two hypotheses

– Null Hypothesis (H0): The null hypothesis is the hypothesis to be
tested
• E.g., The average debt-equity ratio for US industrial firms is 20%

– Alternative Hypothesis (H1): The alternative hypothesis is the one
accepted if the null hypothesis is rejected
• E.g., The average debt-equity ratio for US industrial firms is different
from 20%


• Note: The null hypothesis is a statement that is considered true
unless the sample used in the hypothesis testing provides
evidence that it is false

• Hypothesis tests for a population parameter θ in relation to a
possible value θ0 can be formulated as follows

– H0: θ = θ0 vs. H1: θ ≠ θ0
– H0: θ ≤ θ0 vs. H1: θ > θ0
– H0: θ ≥ θ0 vs. H1: θ < θ0


• The first formulation is a two-sided test while the other two are
one-sided tests

• In each formulation the null and the alternative account for all
possible values of the population parameter

• Regardless of the formulation, the test is always conducted at
the point of equality, θ = θ0


• How do we state the null and alternative hypotheses?

• Example: Suppose that theory tells us that growth funds
outperform value funds

– H0: Growth funds perform worse than or the same as value funds
– H1: Growth funds perform better than value funds

• We formulate the alternative hypothesis as the statement that
the condition is true and test the validity of the null that the
statement is false

Identifying the Test Statistic and its
Probability Distribution

• The decision rule for the hypothesis test is based on a test
statistic

• The test statistic is a quantity calculated from sample
information that typically has the following form

(Sample Statistic – Value of Parameter under H0) / Standard Error of
Sample Statistic


• Example: Suppose that we want to test the null hypothesis that
the mean return on the S&P 500 index during the past five years
is less than or equal to 10% vs. the alternative that it is greater

• Drawing a sample and calculating the sample mean, we know

– If the population distribution is normal with known variance, the
sample mean follows a normal distribution and we use the
standardized variable Z as our test statistic


– If, in the above case, the population variance is unknown but the
sample is large, we again use Z as our test statistic
– If the population variance is unknown and the sample size is small, we
use the variable t as our test statistic

• If, for example, the variance of S&P 500 returns is unknown, we
will use the variable t, known as the t-statistic

$$t_{n-1} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

where $\mu_0$ is the value of the population mean under H0 and $n - 1$ is the
number of degrees of freedom
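
As a minimal sketch of the formula above, the following Python function computes the t-statistic from summary statistics. The function name, the argument names and the numbers in the usage lines are illustrative assumptions, not values from the slides.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_std, n):
    """(sample statistic - value under H0) / standard error of the sample statistic."""
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical inputs: 60 monthly returns averaging 1.2 with s = 4.0,
# tested against a hypothesized mean of 0.8.
n_obs = 60
t_value = t_statistic(1.2, 0.8, 4.0, n_obs)
print(f"t with {n_obs - 1} degrees of freedom: {t_value:.3f}")
```

The result is then compared with a cutoff from the t-distribution with n - 1 degrees of freedom.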

Specifying the Significance Level

• To decide whether to reject the null hypothesis, the t-statistic is
compared to a pre-specified value

• The selected value is based on a pre-determined level of
significance

• Note that the null hypothesis can be either true or false

• As a result, there are four possible outcomes when a hypothesis
is tested


• A false null hypothesis is rejected, which is a correct decision

• A true null hypothesis is rejected (this is called a Type I error)

• A false null hypothesis is not rejected (this is called a Type II
error)

• A true null hypothesis is not rejected, which is again a correct
decision


• The probability of Type I error in a hypothesis test is called the
level of significance of the test

• Conducting a hypothesis test, we want the chance of type I error
to be as low as possible

• E.g., A level of significance of 5% implies a 5% chance of rejecting
a true null hypothesis (type I error)

• Note: For a given sample size, as we decrease the chance of a type I
error, we increase the chance of a type II error


• Lowering the chance of type I error implies that the null will be
rejected less often, including when it is false (type II error)

• To lower the probabilities of both errors we need to increase the
sample size

• The power of a test is the probability of correctly rejecting a false
null hypothesis (The power of a test is 1 – P(type II error))

• Conventional significance levels when testing hypotheses are:
10%, 5%, 1%
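
The trade-off between the two error types can be illustrated with a small sketch. It assumes a one-sided Z-test (H0: µ ≤ µ0 vs. H1: µ > µ0) with known standard deviation; the function name and the numbers (a true shift of 1.0, σ = 5.0, samples of 25 and 100) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def power_one_sided_z(true_shift, sigma, n, alpha):
    """Power of the test H0: mu <= mu0 vs. H1: mu > mu0 with known sigma.

    true_shift is the true mean minus mu0.
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)      # reject H0 if Z > critical_z
    shift_in_se = true_shift / (sigma / n ** 0.5)   # true shift in standard-error units
    return 1 - std_normal.cdf(critical_z - shift_in_se)


for alpha in (0.10, 0.05, 0.01):
    for n in (25, 100):
        power = power_one_sided_z(true_shift=1.0, sigma=5.0, n=n, alpha=alpha)
        print(f"alpha={alpha:.2f}, n={n:3d}: power={power:.3f}, "
              f"P(type II)={1 - power:.3f}")
```

For a fixed sample size, a lower significance level gives lower power (a higher chance of a type II error), while a larger sample raises the power at every significance level.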


• Example

– If we reject the null hypothesis at the 10% significance level, we
have some evidence that the alternative is true

– If we reject the null hypothesis at the 5% significance level, we
have strong evidence that the alternative is true

– If we reject the null hypothesis at the 1% significance level, we
have very strong evidence that the alternative is true

Stating the Decision Rule

• The decision rule compares the calculated test statistic with
specific cutoffs from the tables of the statistic’s distribution

• Example: Suppose that the test statistic that we use is the
Z-statistic (Z variable) and that we use a 5% significance level

• If the hypothesis test is H0: θ = θ0 vs. H1: θ ≠ θ0 then the two
rejection values are $Z_{0.025} = 1.96$ and $-Z_{0.025} = -1.96$

• We would reject the null if Z < -1.96 or Z > 1.96
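
The two-sided rule above can be sketched in a few lines of Python using the standard library's normal distribution. The function name and the sample Z values are hypothetical.

```python
from statistics import NormalDist


def two_sided_z_decision(z_stat, alpha=0.05):
    """Return (reject_null, cutoff) for H0: theta = theta0 vs. H1: theta != theta0."""
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    return abs(z_stat) > cutoff, cutoff


for z in (-2.3, 1.5):  # hypothetical test statistics
    reject, cutoff = two_sided_z_decision(z)
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"Z = {z:+.2f}, cutoffs = ±{cutoff:.2f}: {verdict}")
```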

Collecting Data, Calculating Test Statistic
and Making a Decision

• In collecting a sample, it is important to avoid problems of
sample selection bias, such as survivorship bias

• Example: If we want to test a hypothesis regarding bank
performance and we choose in our sample only the banks that
exist in the last quarter, we do not include the banks that have
failed

• Banks still in existence are likely to have performed better than
those that failed and, thus, there will be some bias in our sample

Hypothesis Tests and Financial Decisions

• Deciding whether to reject the null hypothesis means making a
statistical decision

• Does this always translate into a corresponding financial
decision?

• Example: Suppose we find support through a test for the
hypothesis that on average stocks provide higher returns than
bonds


• Does this statistical decision have a financial meaning, as well?

• From a financial or investment perspective we may also want to
understand the risks of investing in these two types of assets

• Finally, we define the p-value as the smallest level of
significance at which we can reject the null hypothesis
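
As an illustration of the p-value definition, the sketch below computes p-values for a Z-statistic under the three formulations of the alternative hypothesis. The function name, the `alternative` labels and the example value Z = 1.96 are assumptions made for this example.

```python
from statistics import NormalDist


def z_p_value(z_stat, alternative="two-sided"):
    """Smallest significance level at which H0 would be rejected, given Z."""
    std_normal = NormalDist()
    if alternative == "greater":      # H1: theta > theta0
        return 1 - std_normal.cdf(z_stat)
    if alternative == "less":         # H1: theta < theta0
        return std_normal.cdf(z_stat)
    if alternative == "two-sided":    # H1: theta != theta0
        return 2 * (1 - std_normal.cdf(abs(z_stat)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


p = z_p_value(1.96)
print(f"two-sided p-value for Z = 1.96: {p:.4f}")
```

The null hypothesis is rejected at significance level α whenever the p-value is below α, which is equivalent to comparing the statistic with the tabulated cutoff.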

Hypothesis Test for a Single Mean
(Normal Distribution, Variance Unknown)

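Hypothesis test (significance level α) and the rule for rejecting H0,
where $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$:

– H0: µ = µ0 or µ ≤ µ0 vs. H1: µ > µ0: reject H0 if $t > t_{n-1,\alpha}$
– H0: µ = µ0 or µ ≥ µ0 vs. H1: µ < µ0: reject H0 if $t < -t_{n-1,\alpha}$
– H0: µ = µ0 vs. H1: µ ≠ µ0: reject H0 if $t > t_{n-1,\alpha/2}$ or
$t < -t_{n-1,\alpha/2}$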

Example of Hypothesis Test for a
Single Mean

• Suppose that the controller of a firm monitors the firm’s
payments from its customers through days receivables

• The firm has tried to maintain an average of 45 days in
receivables

• A recent random sample of 50 accounts has shown a mean of
49 days and a standard deviation of 8 days

• Can we conclude that the average days in receivables for this
firm has increased, i.e., can we reject the hypothesis that it has not?


• The testable hypotheses are stated as follows

H0: µ ≤ 45
H1: µ > 45

• The test can be conducted at the 5% and 1% levels of
significance

• Since the population variance is unknown, we use the t-statistic,
which is

$$t_{49} = \frac{49 - 45}{8/\sqrt{50}} = 3.536$$


• The cutoffs for the t-distribution with 49 degrees of freedom at
the 5% and 1% levels of significance are 1.677 and 2.405,
respectively

• Given that our t-statistic is greater than both cutoffs, the null
hypothesis is rejected both at the 5% and 1% levels

• This implies that there has been a statistically significant
increase in the days receivables for this firm
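
The receivables example can be reproduced with a short sketch. The variable names are illustrative; the cutoffs are the tabulated values quoted above rather than values computed in code.

```python
import math

# Summary statistics from the example
sample_mean, target_mean, sample_std, n = 49.0, 45.0, 8.0, 50

# t = (sample mean - mean under H0) / standard error of the sample mean
t_stat = (sample_mean - target_mean) / (sample_std / math.sqrt(n))

# One-sided cutoffs for 49 degrees of freedom, as quoted in the text
cutoffs = {0.05: 1.677, 0.01: 2.405}

print(f"t with {n - 1} degrees of freedom = {t_stat:.3f}")
for alpha, cutoff in cutoffs.items():
    decision = "reject H0" if t_stat > cutoff else "do not reject H0"
    print(f"alpha = {alpha:.2f}: cutoff = {cutoff}: {decision}")
```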

Hypothesis Test for Difference Between
Population Means

• We often want to test the hypothesis that the population means
differ between two groups

• Examples

– Is the average debt-equity ratio higher for mature compared to
young firms?
– Do average stock returns differ by decade?
– Do community banks on average lend more to small businesses
than larger banking institutions?
– Do average corporate defaults differ by industry?


• Taking samples from the two populations, we can formulate the
following hypotheses

– H0: µ1 = µ2 vs. H1: µ1 ≠ µ2
– H0: µ1 ≤ µ2 vs. H1: µ1 > µ2
– H0: µ1 ≥ µ2 vs. H1: µ1 < µ2

• Two cases (assuming samples are independent):

– Populations are assumed normally distributed, variances are
unknown, but equal
– Populations are assumed normally distributed, variances are
unknown, but unequal


• When population variances are assumed to be equal, the
t-statistic is as follows

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and the degrees of freedom are $n_1 + n_2 - 2$
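
A minimal sketch of the equal-variance case follows. The function name, its arguments and the summary statistics in the usage lines are hypothetical; `hypothesized_diff` stands for the value of µ1 − µ2 under the null, which is zero when testing for equal means.

```python
import math


def pooled_t_test(mean1, std1, n1, mean2, std2, n2, hypothesized_diff=0.0):
    """Return (t, degrees of freedom, pooled variance) for the equal-variance case."""
    pooled_var = ((n1 - 1) * std1**2 + (n2 - 1) * std2**2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var / n1 + pooled_var / n2)
    t_stat = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return t_stat, n1 + n2 - 2, pooled_var


# Hypothetical summary statistics for two independent samples
t_pooled, df_pooled, var_pooled = pooled_t_test(5.2, 2.0, 30, 4.1, 2.5, 40)
print(f"pooled variance = {var_pooled:.3f}, t = {t_pooled:.3f}, df = {df_pooled}")
```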


• When population variances cannot be assumed to be equal, the
t-statistic is as follows

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

and the degrees of freedom are

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1} + \dfrac{(s_2^2/n_2)^2}{n_2}}$$
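
The unequal-variance case can be sketched in the same way. The function name and the inputs are hypothetical, and the degrees of freedom follow the formula as written here; many references and libraries divide the two denominator terms by n1 − 1 and n2 − 1 instead of n1 and n2, which gives a slightly different value.

```python
import math


def unequal_variance_t_test(mean1, std1, n1, mean2, std2, n2, hypothesized_diff=0.0):
    """Return (t, degrees of freedom) when the variances are not assumed equal."""
    v1 = std1**2 / n1  # estimated variance of the first sample mean
    v2 = std2**2 / n2  # estimated variance of the second sample mean
    t_stat = ((mean1 - mean2) - hypothesized_diff) / math.sqrt(v1 + v2)
    # Degrees of freedom as given in the text (denominator terms divided by n1, n2)
    df = (v1 + v2) ** 2 / (v1**2 / n1 + v2**2 / n2)
    return t_stat, df


# Hypothetical summary statistics for two independent samples
t_unequal, df_unequal = unequal_variance_t_test(5.2, 2.0, 30, 4.1, 2.5, 40)
print(f"t = {t_unequal:.3f}, df = {df_unequal:.1f}")
```

The degrees of freedom are generally not an integer; when reading cutoffs from a table, a common convention is to round down to the nearest whole number.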

Example of Hypothesis Test for Differences
Between Population Means

• Suppose that we observe monthly returns on the S&P 500 from
the 1970s and the 1980s (equal sample sizes of 120 observations each)

– For the 1970s, the mean monthly return is 0.58 and the standard
deviation is 4.598
– For the 1980s, the mean monthly return is 1.47 and the standard
deviation is 4.738

• We want to test whether the two population means are equal,
assuming that they are both normally distributed and that
variances are unknown but equal


• The hypothesis test is formulated as follows

H0: µ70 = µ80 vs. H1: µ70 ≠ µ80

• Suppose we are interested in testing the above hypothesis at
the 5% and 1% levels of significance

• Assuming the two samples are independent, the degrees of
freedom are n1 + n2 - 2 = 120 + 120 - 2 = 238


• Plugging the relevant information into the formulas for the
estimator of the common population variance, $s_p^2$, and the
t-statistic, we find that t = -1.477

• The cutoffs of the t-distribution for this two-sided test are

– At the 5% level, we reject the null if t < -1.972 or t > 1.972
– At the 1% level, we reject the null if t < -2.601 or t > 2.601

• Given our t-statistic of -1.477, we cannot reject the null
hypothesis at either the 5% or the 1% significance level
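
The S&P 500 example can be checked with a short sketch that applies the pooled-variance formulas to the summary statistics given above. The variable names are illustrative, and the cutoffs are the values quoted in the text rather than values computed in code.

```python
import math

# Summary statistics from the example (120 monthly returns per decade)
n = 120
mean_70s, std_70s = 0.58, 4.598
mean_80s, std_80s = 1.47, 4.738

# Pooled estimator of the common population variance
pooled_var = ((n - 1) * std_70s**2 + (n - 1) * std_80s**2) / (2 * n - 2)

# t-statistic under H0: mu_70 = mu_80
standard_error = math.sqrt(pooled_var / n + pooled_var / n)
t_stat = (mean_70s - mean_80s) / standard_error
df = 2 * n - 2

# Two-sided cutoffs as quoted in the text
cutoffs = {0.05: 1.972, 0.01: 2.601}

print(f"pooled variance = {pooled_var:.3f}, t = {t_stat:.3f}, df = {df}")
for alpha, cutoff in cutoffs.items():
    decision = "reject H0" if abs(t_stat) > cutoff else "do not reject H0"
    print(f"alpha = {alpha:.2f}: cutoffs = ±{cutoff}: {decision}")
```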
