Inferential Statistics 101 - Part 3: Shweta Doshi

Inferential Statistics 101 - Part 3

Shweta Doshi
Apr 2, 2018 · 10 min read
Hypothesis Testing
Joey: Good afternoon Chandler.
Chandler: Good afternoon Joe.
Joey: What are we doing?
Chandler: Wasting our lives??
Joey: No, I am asking about lunch.
Chandler: We will try burgers today.
Joey: Sounds good. Do you remember the problem which we were discussing on the first day of your job?
Chandler: Yeah, we were trying to answer the question “Does the background colour of our website affect the number of clicks which it receives?”
Number of clicks before changing the colour: 30 per day
Number of clicks after changing the colour: 63.875 per day
Joey: Yeah, you told me that we can’t make a decision just by comparing the sample averages, right? And that we would get a different sample average each time we rerun our experiment. So far you have explained the central limit theorem, the normal distribution and inferential statistics to me (please revisit the previous blogs if you want to know more about those topics). Is there anything else I should know before solving this problem?
Chandler: Yes, you need to know an important topic in inferential statistics, which is hypothesis testing.
Joey: Yeah, I do vaguely remember you mentioning hypothesis testing during our discussion of the central limit theorem.
Chandler: Let me explain what hypothesis testing is. Suppose we are going to decide the colour of our website based on the number of clicks it receives, and the number of clicks on our website is 30 per day for the default colour. If changing the colour increases the number of clicks, we will keep the new colour permanently. Otherwise, we will stick to our default one.
Here, we need to decide whether to change the colour of the website based on the sample data that we have collected, whose mean is 63.875 per day. We also understand that we can’t make a decision just by comparing two numbers (63.875 per day > 30 per day) because the number of clicks is a random variable. This is where hypothesis testing comes in handy. It accounts for this randomness and uses the sample data to make a decision. In other words, hypothesis testing uses sample data to make an inference about the population parameter.
Let me give you some examples which use hypothesis testing to make a decision:
A doctor wanting to know whether children who take vitamin C are less likely to become ill.
A manufacturer wanting to check whether the product’s quality meets the pre-specified criteria.
A scientist wanting to know whether teenage boys are more prone to behavioural problems than teenage girls.
In all of the above examples, it is not possible for us to analyse the entire population to arrive at a decision. If the doctor wants to know whether children who take vitamin C are less likely to become ill, it would be very costly, and often infeasible, to examine every child in the world. So, we always try to make a decision by looking at a sample from the population.
The following procedure is adopted for conducting hypothesis testing:
1. Formulate Null and Alternative Hypothesis: We need to formulate two hypotheses, the null and the alternative hypothesis. The null hypothesis is usually denoted as H0 and the alternative hypothesis as H1. The null and alternative hypotheses are two mutually exclusive and collectively exhaustive statements about a population parameter. We have to make these statements about the parameter and not about its estimate. The difference between a parameter and an estimate is that the parameter characterizes the population and the estimate characterizes the sample. An estimator is a type of statistic (some function of the samples which we get). Example: the population mean (μ) is a parameter and the sample mean (x̄) is an estimate. An important thing to keep in mind while formulating these hypotheses is that the null hypothesis is a commonly accepted fact (or the default value) and the alternative hypothesis is the statement which a researcher wants to test. In our problem, the null and alternative hypotheses are
H0: Changing the colour of the website doesn’t influence the number of clicks which it receives.
H1: Changing the colour of the website influences the number of clicks which it receives.
If changing the colour of the website has an influence on the number of clicks, then the average number of clicks will change from the default value (i.e., 30 per day) upon changing the colour. Mathematically, it is equivalent to saying
H0: μ = 30 per day
H1: μ ≠ 30 per day
Please note the following things in our hypotheses:
Both hypotheses are mutually exclusive, i.e., both statements cannot be true simultaneously, and collectively exhaustive, i.e., together they cover all possible options.
The hypotheses are made about the population parameter (μ) and not about the estimate (x̄).
The above hypothesis test is also called a two-tailed test. There is another way to formulate the null and the alternative hypothesis, called a one-tailed test, which is used when the null hypothesis states that the parameter is greater than or equal to (or less than or equal to) some pre-specified value:
H0: μ ≥ 30 per day
H1: μ < 30 per day
or
H0: μ ≤ 30 per day
H1: μ > 30 per day
The word ‘null’ in the null hypothesis means that it’s a commonly accepted fact that statisticians work to nullify. We can even call it a falsifiable hypothesis. This is one of the reasons we usually say either that we “reject the null hypothesis” or “fail to reject the null hypothesis” at the end of our hypothesis testing.
2. Calculate test statistic for the sample data: A test statistic is some function of the sample data which compares it with the expected value of the population parameter, which in turn helps us to make a decision in hypothesis testing.
Let me digress a little to explain the mechanism behind the test statistic. Do you remember the CLT (central limit theorem)?
It states that
“The aggregation of a sufficiently large number of independent random variables results in a random variable which will be approximately normally distributed.”
Additionally, it also tells us the parameters of that normal distribution (i.e., the sampling distribution of sample means), where:
μ = Population mean (original/parent distribution of the observations)
σ = Population standard deviation
x̄ = Sample mean
1. The mean of the sampling distribution of the sample mean is
E[x̄] = μ
where x̄ is the sample mean.
2. The standard error (also called the standard deviation of the theoretical distribution) of the sampling distribution is given by
σ/√n
where n is the sample size.
(Please revisit our blog on the sampling distribution of the sample mean and the central limit theorem to understand the intuition behind the theorem.)
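The two properties above can be checked with a small simulation. This is only a sketch under assumptions that are not in the original post: the parent population is taken to be exponential with mean 30 (so its standard deviation is also 30), and the function name `simulate_sample_means` is made up for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, repetitions, rng):
    """Draw `repetitions` samples of size n and return their sample means."""
    # Assumed parent population: exponential with mean 30 (sigma is also 30).
    return [
        statistics.fmean([rng.expovariate(1 / 30) for _ in range(n)])
        for _ in range(repetitions)
    ]

rng = random.Random(0)               # fixed seed so the run is repeatable
n = 8
means = simulate_sample_means(n, 20_000, rng)

mean_of_means = statistics.fmean(means)
empirical_se = statistics.pstdev(means)
theoretical_se = 30 / math.sqrt(n)   # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.2f} (population mean is 30)")
print(f"empirical SE: {empirical_se:.2f}, sigma/sqrt(n): {theoretical_se:.2f}")
```

Even though the parent distribution is skewed, the average of the sample means should sit close to μ and their spread close to σ/√n.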
The CLT helps us to create a distribution for the null hypothesis (population parameter), which is called the null distribution. Assume that we know the population standard deviation of the observations (σ). Then the distribution under the null hypothesis is given by the CLT, which is N(μ, σ²/n). In our problem, assume that we have found the population standard deviation, which turns out to be 25, and the sample size is 8 (we collected 8 observations, Table 1). So, the null distribution for our problem is N(30, 25²/8) (the distribution of the sample means). Therefore, the pdf of the null distribution is
f(x̄) = 1 / √(2π · 25²/8) · exp(−(x̄ − 30)² / (2 · 25²/8))
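As a rough sketch, the null distribution can be built with Python's standard library `statistics.NormalDist`. The values μ = 30, σ = 25 and n = 8 are the ones used in the worked example; the variable names are chosen here for illustration only.

```python
import math
from statistics import NormalDist

mu_0 = 30      # mean under H0 (clicks per day)
sigma = 25     # population standard deviation used in the worked example
n = 8          # number of observed days

standard_error = sigma / math.sqrt(n)
null_dist = NormalDist(mu=mu_0, sigma=standard_error)

print(f"standard error: {standard_error:.3f}")
print(f"variance of the null distribution: {null_dist.variance:.3f}")
print(f"density at 30: {null_dist.pdf(30):.5f}")
print(f"density at 63.875: {null_dist.pdf(63.875):.7f}")
```

The density at the observed sample mean is far smaller than at the hypothesized mean, which is the first hint that the sample sits in the tail of the null distribution.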
I think now you are ready to understand how a test statistic compares the sample data with the expected value of the population parameter under the null hypothesis. There are a variety of test statistics, which are selected based on some criteria. In our case, we will be using the Z-statistic. It is used when the following conditions hold true:
We are performing hypothesis testing on the population mean.
We assume that we know the population standard deviation (σ).
The Z-statistic is defined as
Z = (x̄ − μ) / (σ/√n)
The above formula must be familiar to you; it is just a way to convert a normal distribution into a standard normal distribution Z. The only difference here is that we are using σ/√n in the denominator instead of σ. The reason is that the standard deviation of the sampling distribution of the sample mean (the theoretical distribution of the sample mean) is σ/√n. We also know that the standard normal distribution can be seen as a shifted and scaled version of a normal distribution. The figures below depict the same.
To calculate the Z-statistic, we know all the values except the sample mean (x̄). In our example, the collected sample (Table 1) has a mean of x̄ = 63.875 clicks per day, which gives
Z = (63.875 − 30) / (25/√8) ≈ 3.83
The value of the Z-statistic (3.83) tells us how far the sample mean is from the null hypothesis mean (a positive value corresponds to the sample mean being higher, and a negative value corresponds to the hypothesized mean being higher).
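The Z-statistic calculation can be sketched in a few lines of Python. The helper name `z_statistic` is hypothetical; the inputs are the sample mean (63.875), the hypothesized mean (30), σ = 25 and n = 8 from this example.

```python
import math

def z_statistic(sample_mean, mu_0, sigma, n):
    """Return (sample_mean - mu_0) / (sigma / sqrt(n))."""
    if n <= 0 or sigma <= 0:
        raise ValueError("n and sigma must be positive")
    return (sample_mean - mu_0) / (sigma / math.sqrt(n))

z = z_statistic(sample_mean=63.875, mu_0=30, sigma=25, n=8)
print(round(z, 2))
```

A sample mean equal to the hypothesized mean would give a Z-statistic of exactly zero.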
3. Calculate p-value (probability value): The Z-statistic tells us the distance between the sample mean and the hypothesized mean (H0) in terms of standard errors. A Z-score of 3.83 indicates that the sample mean is 3.83 standard errors away from the hypothesized mean. From the normal distribution blog we know that:
1. We can calculate the probability for a given Z-score.
2. A very high or a very low (negative) Z-score corresponds to a very small probability value and is found near the tails of a normal distribution.
Once the Z-score is computed, the corresponding probability value is obtained from the Z table (procedure outlined in the normal distribution blog). The higher the probability value, the more plausible it is that the collected sample came from the theoretical distribution, i.e., that x̄ is a draw from the null distribution N(30, 25²/8). A low probability value indicates that x̄ might have come from some other distribution.
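Instead of a printed Z table, the tail probability can be computed directly. The sketch below uses the standard identity P(Z ≥ z) = ½·erfc(z/√2) for a standard normal Z; the function name `upper_tail` is an illustrative choice, not something from the original post.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (0.0, 1.0, 2.0, 3.83):
    print(f"z = {z:4.2f}  P(Z >= z) = {upper_tail(z):.6f}")
```

The printed probabilities shrink quickly as z moves into the tail, which is why a Z-score of 3.83 corresponds to such a small probability value.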
4. Decision making based on the significance level: So far we have formulated the hypotheses, calculated the test statistic and computed the p-value. As mentioned earlier, the p-value is, loosely, the probability of observing the collected sample from the theoretical distribution; more precisely, it is the probability of obtaining a result at least as extreme as the one observed (the sample), assuming that the null hypothesis is true. In a nutshell, the p-value tells us how likely a sample like ours is under the null distribution. In order to make a binary decision in hypothesis testing, i.e., either rejecting the null hypothesis or failing to reject it, the p-value alone is not enough; we need to fix a threshold. For now let’s assume that the threshold is 0.1:
p-value > 0.1 (Decision: Fail to reject the null hypothesis)
p-value < 0.1 (Decision: Reject the null hypothesis)
The number 0.1 is called the significance level, which typically represents the level of acceptable error in our decision. Generally, the decisions are made as follows:
p-value > α (Decision: Fail to reject the null hypothesis)
p-value < α (Decision: Reject the null hypothesis)
Where α is the significance level. Statisticians and researchers have to decide on a value of α before conducting the hypothesis test. Typical values for α are 0.1, 0.05, and 0.01. Let us calculate the p-value for our problem, which is a two-tailed Z-test:
p-value = 2 · P(Z ≥ 3.83) ≈ 0.0001
Let us assume an α value of 0.10. Since the p-value is less than α, we reject the null hypothesis. In other words, changing the colour of our website influences the number of clicks.
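The decision rule can be sketched as follows, assuming the Z-score of 3.83 and α = 0.10 from this example. The helper names `two_tailed_p_value` and `decide` are made up for illustration.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """2 * P(Z >= |z|) for a standard normal Z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha):
    """Apply the decision rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

p = two_tailed_p_value(3.83)
print(f"p-value = {p:.4f}")
print(decide(p, alpha=0.10))
```

Note that α is passed in explicitly, mirroring the point that it should be fixed before the test is run rather than tuned after seeing the p-value.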
Joey: Wow, I didn’t realize the importance of hypothesis testing until this conversation. But I still have a doubt… Why don’t we say “accept the alternative hypothesis” instead of “reject the null hypothesis”, since both mean the same?
Chandler: When our p-value is very low, the only conclusion we can make is that x̄ is unlikely to have come from the theoretical or null distribution. We can’t draw any direct conclusion about the alternative hypothesis. That’s why we always say “reject the null hypothesis” instead of “accept the alternative hypothesis”.
Joey: Makes sense. If my understanding of the p-value is correct, the computation of the p-value is different for one-tailed and two-tailed tests. Am I right?
Chandler: Oh yes, you are absolutely correct. The p-value is the probability of obtaining a result at least as extreme as the one observed (the sample). Let ‘a’ be the z-score. Then:
Left-tailed test (H1: μ < 30): p-value = P(Z ≤ a)
Right-tailed test (H1: μ > 30): p-value = P(Z ≥ a)
Two-tailed test (H1: μ ≠ 30): p-value = 2 · P(Z ≥ |a|)
Joey: Okay, based on our conversation, I understand that the p-value is the probability of observing the collected sample in the theoretical distribution or the null distribution. So, for a one-tailed test, I should be calculating P(Z = a) and not P(Z ≤ a), right?
Chandler: Let me put it this way: the probability of observing any single exact value in a normal distribution is zero, since it is a continuous distribution. Therefore, it is not possible to calculate the probability of observing exactly the sample we collected in the null distribution, as it always results in zero. That is the reason why we define the p-value as the probability of obtaining a result at least as extreme as the one observed (the sample).
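To make the one-tailed versus two-tailed distinction concrete, here is a small sketch that returns the p-value for a z-score ‘a’ under each kind of test. The function name `p_value` and the `tail` labels are assumptions made for this illustration. Each branch is an area under the standard normal curve (a probability of the form P(Z ≤ a) or P(Z ≥ a)), never the probability of a single point.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def p_value(a, tail):
    """p-value for z-score `a`; `tail` is 'left', 'right' or 'two'."""
    if tail == "left":
        return STD_NORMAL.cdf(a)                  # P(Z <= a)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(a)              # P(Z >= a)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(a)))   # 2 * P(Z >= |a|)
    raise ValueError("tail must be 'left', 'right' or 'two'")

a = 3.83
for tail in ("left", "right", "two"):
    print(tail, p_value(a, tail))
```

For a positive z-score, the two-tailed p-value is exactly twice the right-tailed one, because the standard normal distribution is symmetric about zero.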
Chandler: Okay, let’s pull the data from our database and run the hypothesis test to check whether changing the colour of the website has any influence on the number of clicks which it receives.
Let red be the default colour and blue be the new colour to which we change the website. Assume that we know from historical data that the average number of clicks is 30 per day when the colour is red. Now we change the colour to blue and record the number of clicks per day for 8 days (assume σ = 25 and α = 0.05).
No. of clicks per day when the background colour is Blue:
The corresponding p-value is 0.0001, which is less than the threshold, so we reject the null hypothesis and conclude that changing the colour of the website influences the number of clicks.
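The whole procedure can be put together in one short sketch. The per-day click counts are not reproduced here, so the code starts from the reported sample mean of 63.875 rather than from raw data; `ZTestResult` and `one_sample_z_test` are hypothetical names, and the test is the two-tailed Z-test with known σ described above.

```python
import math
from statistics import NormalDist
from typing import NamedTuple

class ZTestResult(NamedTuple):
    z: float
    p_value: float
    reject_null: bool

def one_sample_z_test(sample_mean, n, mu_0, sigma, alpha):
    """Two-tailed one-sample Z-test with a known population sigma."""
    z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return ZTestResult(z=z, p_value=p, reject_null=p < alpha)

result = one_sample_z_test(sample_mean=63.875, n=8, mu_0=30, sigma=25, alpha=0.05)
print(f"z = {result.z:.2f}, p-value = {result.p_value:.4f}, "
      f"reject H0: {result.reject_null}")
```

Returning the Z-statistic and p-value alongside the decision keeps the evidence visible, so a reader can rerun the decision with a different α.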
The author of this blog is Balaji P, who is pursuing a PhD in reinforcement learning at IIT Madras.
Quora: www.quora.com/profile/Balaji-Pitchai-Kannu
Shweta Doshi
I am an unapologetic idealist who believes that to gain quality education, we need to transform the way we teach & learn. I am the Co-Founder at www.greyatom.com
GreyAtom
GreyAtom is committed to building an educational ecosystem for learners to upskill & help them make a career in data science.