INTRODUCTION
TO
HYPOTHESIS TESTING - III

Muhammad Afzal
Bio-statistician SZABMU
(M.Phil Statistics, MSc. Biostatistics, Diploma in project management)
STEPS TO UNDERTAKING A HYPOTHESIS TEST

Define study question

Set null and alternative hypothesis

Calculate a test statistic

Calculate a p-value

Make a decision and interpret your conclusions
WHAT IS A TEST STATISTIC?

• A test statistic is a random variable that is calculated from sample data and used in a hypothesis test.

• You can use test statistics to determine whether to reject the null hypothesis or not.

• The test statistic compares your data with what is expected under the null hypothesis.

• The test statistic is used to calculate the p-value.
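
As a minimal sketch of how sample data are boiled down to a test statistic, the snippet below computes a one-sample z statistic. The function name `z_statistic`, the test scores, the hypothesized mean of 70, and the assumed known standard deviation of 10 are all hypothetical values chosen for illustration.

```python
import math

def z_statistic(sample, mu0, sigma):
    """Number of standard errors between the sample mean and the null value mu0."""
    n = len(sample)
    sample_mean = sum(sample) / n
    standard_error = sigma / math.sqrt(n)  # sigma is assumed known (z-test setting)
    return (sample_mean - mu0) / standard_error

# Hypothetical data: nine test scores; H0 states that the population mean is 70.
scores = [72, 75, 68, 80, 77, 74, 71, 79, 73]
z = z_statistic(scores, mu0=70, sigma=10)
print(round(z, 2))
```

A value near zero means the sample mean is close to what the null hypothesis expects; values far from zero indicate disagreement with it.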



• A test statistic measures the degree of agreement between a sample of data and the null
hypothesis.

• Its observed value changes randomly from one random sample to a different sample.

• A test statistic contains information about the data that is relevant for deciding whether
to reject the null hypothesis.

• The sampling distribution of the test statistic under the null hypothesis is called the null
distribution.

• When the data show strong evidence against the assumptions in the null hypothesis, the
test statistic becomes unusually large or unusually small, depending on the direction of
the alternate hypothesis.

• This causes the test's p-value to become small enough to reject the null hypothesis.
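
The link between the null distribution and the p-value can be sketched as follows, assuming the test statistic is a z statistic whose null distribution is the standard normal. The function name `p_value` and its `alternative` argument are illustrative choices, not part of the original slides.

```python
from statistics import NormalDist

# Null distribution of a z statistic: the standard normal distribution.
NULL_DISTRIBUTION = NormalDist(mu=0.0, sigma=1.0)

def p_value(z, alternative="two-sided"):
    """Probability, under the null distribution, of a statistic at least as extreme as z."""
    if alternative == "greater":      # Ha: parameter is larger than the null value
        return 1.0 - NULL_DISTRIBUTION.cdf(z)
    if alternative == "less":         # Ha: parameter is smaller than the null value
        return NULL_DISTRIBUTION.cdf(z)
    if alternative == "two-sided":    # Ha: parameter differs from the null value
        return 2.0 * (1.0 - NULL_DISTRIBUTION.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# A more extreme statistic gives a smaller p-value.
print(p_value(1.0), p_value(3.0))
```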
Different hypothesis tests use different test statistics based
on the probability model assumed in the null hypothesis.

Common tests and their test statistics include:

Hypothesis test      Test statistic
Z-test               Z-statistic
t-tests              t-statistic
ANOVA                F-statistic
Chi-square tests     Chi-square statistic
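
To illustrate one row of the table above, here is a minimal sketch of the Pearson chi-square statistic for a contingency table, computed without a continuity correction. The function name and the 2x2 counts are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent (the null hypothesis).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two groups, columns are two outcomes.
observed_counts = [[10, 20], [20, 10]]
print(round(chi_square_statistic(observed_counts), 3))
```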


T-DISTRIBUTION VS Z-DISTRIBUTION:

• A t-test is primarily used when the sample is small and the population standard
deviation is unknown, whereas a z-test is appropriate when the sample size is larger
than about 30 or the population standard deviation is known.

• Example:
• If you measure the average test score from a sample of only 20 students, you should use
the t-distribution to estimate the confidence interval around the mean. If you use the
z-distribution instead, your confidence interval will be too narrow and therefore
artificially precise.
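
The effect described in this example can be sketched numerically. The 20 scores below are hypothetical, and the t critical value 2.093 (two-sided 95%, 19 degrees of freedom) is taken from a standard t table because the Python standard library has no t-distribution.

```python
import math
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value for 19 degrees of freedom, from a standard t table.
T_CRITICAL_95_DF19 = 2.093

def confidence_interval(sample, critical_value):
    """Interval: sample mean plus or minus critical_value times the standard error."""
    centre = mean(sample)
    margin = critical_value * stdev(sample) / math.sqrt(len(sample))
    return centre - margin, centre + margin

# Hypothetical test scores from 20 students.
scores = [62, 71, 55, 80, 68, 74, 59, 66, 77, 70,
          63, 85, 58, 72, 69, 61, 76, 67, 73, 64]

z_critical = NormalDist().inv_cdf(0.975)  # about 1.96
z_interval = confidence_interval(scores, z_critical)
t_interval = confidence_interval(scores, T_CRITICAL_95_DF19)

print("z interval:", z_interval)
print("t interval:", t_interval)
```

The t interval is wider than the z interval because it accounts for the extra uncertainty of estimating the standard deviation from only 20 observations.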
SO, SUMMARIZING, THE TEST STATISTIC IS ...

• A test statistic assesses how consistent your sample data are with the null hypothesis in
a hypothesis test.

• Test statistic calculations take the sample data and boil them down to a single number
that quantifies how much your sample diverges from the null hypothesis.

• As a test statistic value becomes more extreme, it indicates larger differences between
your sample data and the null hypothesis.

• When your test statistic indicates a sufficiently large incompatibility with the null
hypothesis, you can reject the null and state that your results are statistically significant,
and your data support the notion that the sample effect exists in the population.

• To use a test statistic to evaluate statistical significance, you either compare it to a critical
value or use it to calculate the p-value.


ACCEPTANCE AND REJECTION REGIONS

• In hypothesis testing, the test procedure partitions all the possible sample outcomes into
two subsets (on the basis of whether the observed value of the test statistic is smaller
than a threshold value or not).

• The subset that is considered to be consistent with the null hypothesis is called the
"acceptance region"; the other subset is called the "rejection region" (or "critical region").

• Results from a statistical test will fall into one of two regions:

• The rejection region, which will lead you to reject the null hypothesis, or

• The acceptance region, where you provisionally accept (that is, fail to reject) the null hypothesis.

• If the sample outcome falls into the acceptance region, then the null hypothesis is
not rejected (informally, it is "accepted").

• If the sample outcome falls into the rejection region, then the null hypothesis is rejected
in favour of the alternative hypothesis.



• The acceptance region is the complement of the rejection region: if your result
does not fall into the rejection region, it must fall into the acceptance region.

• The acceptance region is “the interval within the sampling distribution of the test
statistic that is consistent with the null hypothesis H0 from hypothesis testing.”

• In simpler terms, let’s say you run a hypothesis test such as a z-test. The result of the
test comes in the form of a z-value, which has a large range of possible values.

• Within that range of values, some will fall into an interval that is consistent with the null
hypothesis. That interval is the acceptance region.
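
For the z-test case just described, the acceptance region can be sketched as an interval around zero. This is an illustration for a two-sided test at a chosen significance level; the function names are hypothetical.

```python
from statistics import NormalDist

def acceptance_region(alpha=0.05):
    """Interval of z-values consistent with H0 for a two-sided z-test at level alpha."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return -critical, critical

def falls_in_acceptance_region(z, alpha=0.05):
    """True if z lies in the acceptance region, False if it lies in the rejection region."""
    lower, upper = acceptance_region(alpha)
    return lower <= z <= upper

print(acceptance_region(0.05))           # roughly (-1.96, 1.96)
print(falls_in_acceptance_region(1.2))   # inside: do not reject H0
print(falls_in_acceptance_region(2.5))   # outside: reject H0
```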


GENERAL PROCEDURE FOR HYPOTHESIS TESTING

• Hypothesis testing is a formal procedure for investigating our ideas about the world using
statistics.

• It is most often used by scientists to test specific predictions, called hypotheses, that arise
from theories.

There are 5 main steps in hypothesis testing:

1. State your research hypothesis as a null hypothesis (H0) and an alternate hypothesis (Ha or H1).
2. Collect data in a way designed to test the hypothesis.
3. Perform an appropriate statistical test.
4. Decide whether to reject or fail to reject your null hypothesis.
5. Present the findings in your results and discussion section (Conclusion).

Though the specific details might vary, the procedure you will use when testing a hypothesis will always
follow some version of these steps.
STEP 1:
STATE YOUR NULL AND ALTERNATE HYPOTHESIS

• After developing your initial research hypothesis (the prediction that you want to
investigate), it is important to restate it as a null (H0) and alternate (Ha) hypothesis so
that you can test it mathematically.

• The alternate hypothesis is usually your initial hypothesis that predicts a relationship
between variables. The null hypothesis is a prediction of no relationship between the
variables you are interested in.
EXAMPLE: STEP-1

• You want to test whether there is a relationship between gender and height. Based on
your knowledge of human physiology, you formulate a hypothesis that men are, on
average, taller than women. To test this hypothesis, you restate it as:

• H0: Men are, on average, not taller than women.

• Ha: Men are, on average, taller than women.
STEP 2:
DATA COLLECTION
• For a statistical test to be valid, it is important to perform sampling and collect data in a way
that is designed to test your hypothesis. If your data are not representative, then you cannot
make statistical inferences about the population you are interested in.

Example:
• To test differences in average height between men and women, your sample should have an
equal proportion of men and women and cover a variety of socio-economic classes and any
other control variables that might influence average height.

• A potential data source in this case might be census data, since it includes data from a variety
of regions and social classes and is available for many countries around the world.
STEP 3:
PERFORM A STATISTICAL TEST

• There are a variety of statistical tests available, but many of them are based on comparing
within-group variance (how spread out the data are within a category) with between-group
variance (how different the categories are from one another).

• If the between-group variance is large enough that there is little or no overlap between
groups, then your statistical test will reflect that by showing a low p-value.

• This means that differences this large would be unlikely to arise by chance alone if the
null hypothesis were true.

• Alternatively, if there is high within-group variance and low between-group variance, then
your statistical test will reflect that with a high p-value.

• This means that the difference you measure between groups could plausibly be due to chance.

• Your choice of statistical test will be based on the type of data you collected.
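
The comparison of between-group and within-group variance can be made concrete with the F statistic used in one-way ANOVA. The sketch below is a plain-Python illustration with made-up groups; the function name `f_statistic` is hypothetical.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic: between-group variance divided by within-group variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: how spread out the values are inside each group.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

separated = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]     # little overlap between groups
overlapping = [[1, 5, 9], [2, 5, 8], [3, 5, 7]]   # identical group means, large spread
print(f_statistic(separated), f_statistic(overlapping))
```

Well-separated groups give a large F (and hence a low p-value), while heavily overlapping groups give an F near zero.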
EXAMPLE: STEP-3

• Based on the type of data you collected, you perform a one-tailed t-test to test whether
men are in fact taller than women. This test gives you:

• An estimate of the difference in average height between the two groups.

• A p-value showing how likely you are to see a difference at least this large if the null
hypothesis of no difference is true.

• Your t-test shows an average height of 175.4 cm for men and an average height of 161.7
cm for women, with a one-sided interval estimate of the true difference ranging from
10.2 cm to infinity. The p-value is 0.002.
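
The slides use a one-tailed t-test here. Because the Python standard library has no t-distribution, the sketch below swaps in a one-tailed permutation test, a different technique that answers the same question: how often would a difference in means this large arise if group labels were irrelevant? The height values are hypothetical and are not the data behind the figures quoted above.

```python
import random
from statistics import mean

def one_tailed_permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate P(mean(a) - mean(b) >= observed difference) under random relabelling."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the observations at random
        difference = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if difference >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return observed, (at_least_as_large + 1) / (n_permutations + 1)

# Hypothetical heights in cm.
men = [178.2, 171.5, 180.1, 175.0, 169.8, 177.3, 174.6, 176.7]
women = [160.4, 165.2, 158.9, 163.7, 161.0, 166.3, 159.5, 162.6]

difference, p = one_tailed_permutation_test(men, women)
print(f"difference = {difference:.1f} cm, p = {p:.4f}")
```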
STEP 4:
DECIDE WHETHER TO REJECT OR FAIL TO
REJECT YOUR NULL HYPOTHESIS

• Based on the outcome of your statistical test, you will have to decide whether to reject
or fail to reject your null hypothesis.

• In most cases you will use the p-value generated by your statistical test to guide your
decision. And in most cases, your predetermined level of significance for rejecting the null
hypothesis will be 0.05; that is, you reject when there is a less than 5% chance that you
would see results at least this extreme if the null hypothesis were true.



• In some cases, researchers choose a more conservative level of significance, such as 0.01
(1%). This reduces the risk of incorrectly rejecting the null hypothesis (Type I error).

• In your analysis of the difference in average height between men and women, you find
that the p-value of 0.002 is below your cutoff of 0.05, so you decide to reject your null
hypothesis of no difference.
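
The effect of choosing a stricter significance level on the Type I error rate can be checked by simulation. The sketch below repeatedly draws samples for which the null hypothesis is true (mean 0, known standard deviation 1), runs a two-sided z-test on each, and counts how often the null is wrongly rejected. The number of experiments, sample size, and seed are arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n_experiments=2000, sample_size=25, seed=1):
    """Fraction of simulated experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    false_rejections = 0
    for _ in range(n_experiments):
        # Data generated under H0: population mean 0, standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        z = (sum(sample) / sample_size) * math.sqrt(sample_size)
        p = 2.0 * (1.0 - standard_normal.cdf(abs(z)))
        if p < alpha:
            false_rejections += 1
    return false_rejections / n_experiments

rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(rate_05, rate_01)
```

The observed rates should sit close to the chosen significance levels, which is what it means for the level to control the risk of a Type I error.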
STEP 5:
PRESENT YOUR FINDINGS

• The results of hypothesis testing will be presented in the results and discussion sections of
your research project or research paper.
• In the results section you should give a brief summary of the data and a summary of the
results of your statistical test (for example, the estimated difference between group means
and associated p-value).
• In the discussion, you can discuss whether your initial hypothesis was supported by your
results or not.
• In the formal language of hypothesis testing, we talk about rejecting or failing to reject the
null hypothesis.
EXAMPLE: STEP-5

• State the results as a conclusion in your report.

• In our comparison of mean height between men and women we found an average
difference of 13.7 cm and a p-value of 0.002; therefore, we can reject the null hypothesis
that men are not taller than women and conclude that men are likely, on average, taller
than women.


EXAMPLES OF HYPOTHESIS TESTING IN REAL
LIFE

• To perform a hypothesis test in the real world, researchers will obtain a random sample from
the population and perform a hypothesis test on the sample data, using a null and alternative
hypothesis:

Null Hypothesis (H0): The sample data occur purely by chance.

Alternative Hypothesis (HA): The sample data are influenced by some non-random cause.

• If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we
can reject the null hypothesis and conclude that we have sufficient evidence to say that the
alternative hypothesis is true.

The following examples provide several situations where hypothesis tests are used in the
real world.
EXAMPLE 1: BIOLOGY

• Hypothesis tests are often used in biology to determine whether some new treatment,
fertilizer, pesticide, chemical, etc. causes increased growth, stamina, immunity, etc. in plants or
animals.

• For example, suppose a biologist believes that a certain fertilizer will cause plants to grow
more during a one-month period than they normally do, which is currently 20 inches.

• To test this, she applies the fertilizer to each of the plants in her laboratory for one month.
• She then performs a hypothesis test using the following hypotheses:

H0: μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
Ha: μ > 20 inches (the fertilizer will cause mean plant growth to increase)

• If the p-value of the test is less than some significance level (e.g. α = .05), then she can
reject the null hypothesis and conclude that the fertilizer leads to increased plant growth.
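
A minimal sketch of the fertilizer test, using the critical-value approach instead of a p-value: compute a one-sample t statistic against the null value of 20 inches and compare it with the upper 5% point of the t-distribution. The ten growth measurements are hypothetical, and the critical value 1.833 (9 degrees of freedom) is taken from a standard t table.

```python
import math
from statistics import mean, stdev

# Upper 5% point of the t-distribution with 9 degrees of freedom, from a standard t table.
T_CRITICAL_ONE_SIDED_05_DF9 = 1.833

def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    return (mean(sample) - mu0) / standard_error

# Hypothetical one-month growth (inches) of ten fertilized plants.
growth_inches = [21.5, 22.0, 19.8, 23.1, 20.9, 22.4, 21.7, 20.3, 22.8, 21.2]

t = one_sample_t(growth_inches, mu0=20.0)
reject_h0 = t > T_CRITICAL_ONE_SIDED_05_DF9  # upper-tailed test: Ha is mu > 20
print(round(t, 2), reject_h0)
```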
EXAMPLE 2: CLINICAL TRIALS

• Hypothesis tests are often used in clinical trials to determine whether some new
treatment, drug, procedure, etc. causes improved outcomes in patients.

• For example, suppose a doctor believes that a new drug is able to reduce blood
pressure in obese patients. To test this, he may measure the blood pressure of 50
patients before and after using the new drug for one month.

• He then performs a hypothesis test using the following hypotheses:

H0: μafter = μbefore (the mean blood pressure is the same before and after using the drug)
HA: μafter < μbefore (the mean blood pressure is less after using the drug)

• If the p-value of the test is less than some significance level (e.g. α = .05), then he
can reject the null hypothesis and conclude that the new drug leads to reduced blood
pressure.
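
Because each patient is measured twice, one common way to test these hypotheses is a paired t-test on the per-patient differences. The sketch below assumes that approach. It uses only eight hypothetical patients (not the 50 in the example) so that a table critical value for 7 degrees of freedom, 1.895, can be used.

```python
import math
from statistics import mean, stdev

# Upper 5% point of the t-distribution with 7 degrees of freedom, from a standard t table.
T_CRITICAL_ONE_SIDED_05_DF7 = 1.895

def paired_t(before, after):
    """t statistic for H0: mean of (after - before) is zero."""
    differences = [a - b for a, b in zip(after, before)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error

# Hypothetical systolic blood pressure (mmHg) for eight patients.
before = [150, 142, 160, 155, 148, 152, 158, 145]
after = [144, 140, 151, 150, 147, 145, 149, 143]

t = paired_t(before, after)
reject_h0 = t < -T_CRITICAL_ONE_SIDED_05_DF7  # lower-tailed test: Ha is after < before
print(round(t, 2), reject_h0)
```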
