Probability and Statistics - 3

Probability & Statistics
17.07.2020
Anh Tuan Tran (Ph.D.) & Thinh Tien Nguyen (Ph.D.)
1. Central Limit Theorem
Theorem:
• Let X_1, …, X_n be i.i.d. random variables with expected value E(X_i) = μ and variance 0 < D(X_i) = σ² < +∞ for i = 1, …, n.
• Then the random variable
\[ Z_n := \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}, \qquad \bar{X} := \frac{X_1 + \cdots + X_n}{n}, \]
converges in distribution to the standard normal random variable as n → +∞.
Example:
• Toss a fair coin n times.
• Let X_i be 1 if Head occurs and 0 if Tail occurs in the i-th toss, for i = 1, …, n.
• E(X_i) = p = 0.5 and D(X_i) = p(1 − p) for i = 1, …, n.
• Then
\[ Z_n := \frac{X_1 + \cdots + X_n - np}{\sqrt{np(1 - p)}} \]
converges in distribution to the standard normal random variable as n → +∞.
• Equivalently, for large n the distribution Binom(n, p) of X_1 + ⋯ + X_n is approximately N(np, np(1 − p)).
[Figure: three panels, n = 2, n = 5, n = 30]
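The normal approximation of Binom(n, p) stated above can be checked numerically. The following is a minimal sketch using only the Python standard library. It compares the exact distribution function of Binom(n, 0.5) with that of N(np, np(1 − p)) for n = 2, 5, 30. The helper names and the continuity correction of 0.5 are assumptions added for illustration; they do not come from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binom(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) with N(np, np(1 - p)).

    The +0.5 continuity correction is an added assumption, commonly used
    when a discrete distribution is approximated by a continuous one.
    """
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)


def max_gap(n, p=0.5):
    """Largest absolute difference between the two distribution functions."""
    return max(
        abs(binom_cdf(k, n, p) - normal_approx_cdf(k, n, p)) for k in range(n + 1)
    )


for n in (2, 5, 30):
    print(n, round(max_gap(n), 4))
```

The gap shrinks as n grows, which is what the convergence statement predicts.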
Example:
• Roll n dice.
• Let X_i be the number shown on the i-th die, for i = 1, …, n.
• E(X_i) = 7/2 and D(X_i) = 35/12 for i = 1, …, n.
• Then
\[ Z_n := \frac{X_1 + \cdots + X_n - \frac{7}{2}n}{\sqrt{\frac{35}{12}n}} \]
converges in distribution to the standard normal random variable as n → +∞.
• Equivalently, for large n the sum X_1 + ⋯ + X_n is approximately N(7n/2, 35n/12).
[Figure: three panels, n = 1, n = 2, n = 8]
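The mean 7n/2 and variance 35n/12 of the sum of n dice can be verified exactly. The sketch below builds the exact distribution of the sum by repeated convolution, using exact fractions. The function names are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction


def dice_sum_pmf(n):
    """Exact pmf of the sum of n fair dice, as a dict {total: probability}."""
    pmf = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                key = total + face
                nxt[key] = nxt.get(key, Fraction(0)) + prob * Fraction(1, 6)
        pmf = nxt
    return pmf


def mean_and_variance(pmf):
    """Mean and variance of a finite pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


for n in (1, 2, 8):
    mean, var = mean_and_variance(dice_sum_pmf(n))
    # Compare with the values 7n/2 and 35n/12 used in the example.
    print(n, mean, var, Fraction(7 * n, 2), Fraction(35 * n, 12))
```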
Example:
A bank teller serves customers standing in the queue one by one. Suppose that the service time X_i for customer i has mean E(X_i) = 2 (minutes) and variance D(X_i) = 1. We assume that service times for different bank customers are independent. Let Y be the total time the bank teller spends serving 50 customers.
Find the probability that the bank teller spends from 90 to 110 minutes serving these customers.
\[ Y = X_1 + \cdots + X_{50} \]
\[ P(90 \le Y \le 110) = P\left(\frac{90 - 2 \cdot 50}{\sqrt{50}} \le \frac{Y - 2 \cdot 50}{\sqrt{50}} \le \frac{110 - 2 \cdot 50}{\sqrt{50}}\right) \approx P\left(-\sqrt{2} \le Z \le \sqrt{2}\right) \approx 0.8427, \]
where Z ~ N(0, 1).
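The same calculation can be reproduced in a few lines. This is a sketch under the central limit approximation Y ≈ N(50 · 2, 50 · 1); the function name is an assumption introduced here.

```python
from math import sqrt
from statistics import NormalDist


def clt_interval_probability(low, high, n, mean, variance):
    """Approximate P(low <= Y <= high) for Y = X_1 + ... + X_n via the CLT."""
    total = NormalDist(mu=n * mean, sigma=sqrt(n * variance))
    return total.cdf(high) - total.cdf(low)


prob = clt_interval_probability(90, 110, n=50, mean=2, variance=1)
print(round(prob, 4))  # 0.8427
```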

2. Hypothesis testing
Motivation problem 1:
• Randomly select 100 people in a city and compute their average height.
• Repeat this a few times; the recorded averages form a sequence of values close to 1.65 m.
• Is the average height of the people in the city exactly 1.65 m or not?
Motivation problem 2:
• A dataset contains the final scores of a group of 300 students.
• The group can be divided into two subgroups, boys and girls.
• The average score of each subgroup is computed: boys 7.91/10 and girls 6.96/10.
• Does sex affect the performance of the students, i.e. is the difference between the average scores significant?
Null and alternative hypotheses
Null hypothesis:
• The hypothesis that is often the opposite of our guess.
• Denoted by H0.
Alternative hypothesis:
• The hypothesis that is often consistent with our guess and is opposite to the null hypothesis.
• Denoted by Ha or H1.
Example 1 (One-sample test):
Two-tailed test:
Is the average height of the people in the city exactly 1.65 m?
H0: μ = 1.65
Ha: μ ≠ 1.65
One-tailed test (right or left):
Is the average height of the people in the city less than or equal to (respectively, greater than or equal to) 1.65 m?
Right-tailed: H0: μ ≤ 1.65, Ha: μ > 1.65
Left-tailed: H0: μ ≥ 1.65, Ha: μ < 1.65
Example 2 (Two-independent-samples test):
Does sex affect the performance of the students, i.e. is the difference between the average scores of the boys and the girls in the class really significant?
Two-tailed test:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
One-tailed tests (right or left):
Right-tailed: H0: μ1 ≤ μ2, Ha: μ1 > μ2
Left-tailed: H0: μ1 ≥ μ2, Ha: μ1 < μ2
Test statistic
Definition:
• A test statistic is the output of a scalar function of all the observations (data).
• The test statistic is constructed under the assumption that the null hypothesis H0 is true.
Example (One-sample test):
• We have the heights of 150 people in a city.
• A test statistic for testing whether the average height of the people in the city is 1.65 m is
\[ t := \frac{\bar{x} - 1.65}{s/\sqrt{150}} \]
where
• x̄ is the average height of the sample,
• s is the adjusted standard deviation of the sample (computed with divisor n − 1).
Why t?
• Suppose the null hypothesis H0 is true, i.e. the average height of the people in the city is exactly μ0 = 1.65 (m).
• By the central limit theorem, for n large enough, approximately
\[ T := \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1) \]
where
• X̄ is the random variable whose possible values are the average heights of all samples of size n taken from the people in the city,
• S is the random variable whose possible values are the adjusted standard deviations of all samples of size n taken from the people in the city.
p-value
Definition:
• Assume H0 is true.
• Let T be the test statistic, viewed as a random variable whose distribution is derived from H0.
• Let t be the observed value of the test statistic computed from the data.
• Then
  ◦ Right-tailed tests: p-value = P(T ≥ t | H0),
  ◦ Left-tailed tests: p-value = P(T ≤ t | H0),
  ◦ Two-tailed tests: p-value = 2 min{P(T ≥ t | H0), P(T ≤ t | H0)}.
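The three definitions translate directly into code. The sketch below assumes that T is (approximately) standard normal under H0, as in the large-sample case; the function name and the `tail` argument are illustrative assumptions.

```python
from statistics import NormalDist


def p_value(t, tail, dist=NormalDist()):
    """p-value of an observed statistic t when T ~ dist under H0.

    tail is "right", "left" or "two". For a continuous distribution,
    P(T >= t) equals 1 - cdf(t).
    """
    right = 1.0 - dist.cdf(t)  # P(T >= t | H0)
    left = dist.cdf(t)         # P(T <= t | H0)
    if tail == "right":
        return right
    if tail == "left":
        return left
    if tail == "two":
        return 2.0 * min(right, left)
    raise ValueError("tail must be 'right', 'left' or 'two'")


for tail in ("right", "left", "two"):
    print(tail, round(p_value(1.96, tail), 4))
```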
[Figures: p-value for the right-tailed test, the left-tailed test, and the two-tailed test (p-value = 2 times the min)]
Two types of errors
Definition:
• Type I (false positive): reject H0 when it is actually true.
• Type II (false negative): accept H0 when it is actually false.
Example:
• Type I: reject the hypothesis that the average height of the people in the city is 1.65 m when it is exactly 1.65 m.
• Type II: accept the hypothesis that the average height of the people in the city is 1.65 m when it is not.
Significance level
Definition:
• Assume H0 is true.
• The probability that H0 will be rejected is called the significance level, i.e. the probability of a Type I error.
• Denoted by α.
Example:
α = 0.05 means a 5% risk of rejecting the hypothesis that the average height of the people in the city is 1.65 m when it is exactly 1.65 m.
[Figures: significance level α for the right-tailed, left-tailed and two-tailed tests]
Accepting H0
If the p-value is larger than the significance level α, we accept H0. Otherwise, we reject it.
Example:
H0: μ ≤ 1.65
Ha: μ > 1.65
• P(T ≥ t | H0) = p-value > α: accept H0.
• P(T ≥ t | H0) = p-value < α: reject H0.
Here
\[ T = \frac{\bar{X} - 1.65}{S/\sqrt{n}} \sim N(0, 1) \quad \text{and} \quad t = \frac{\bar{x} - 1.65}{s/\sqrt{n}}, \]
where x̄ and s are, respectively, the mean and the adjusted standard deviation of a (large) sample of size n of observed data.
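As an illustration of the decision rule for the right-tailed test, here is a minimal sketch. The summary values (sample mean 1.67 m, adjusted standard deviation 0.10 m, n = 150) are hypothetical numbers chosen only for this example, and the function name is an assumption as well.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, sample_sd, n, mu0, alpha=0.05):
    """Test H0: mu <= mu0 against Ha: mu > mu0 using the N(0, 1) approximation."""
    t = (sample_mean - mu0) / (sample_sd / sqrt(n))
    p = 1.0 - NormalDist().cdf(t)  # P(T >= t | H0)
    decision = "accept H0" if p > alpha else "reject H0"
    return t, p, decision


# Hypothetical summary values, for illustration only.
t, p, decision = right_tailed_z_test(
    sample_mean=1.67, sample_sd=0.10, n=150, mu0=1.65
)
print(round(t, 3), round(p, 4), decision)
```

With these made-up numbers the p-value falls below α = 0.05, so H0 is rejected; a sample mean closer to 1.65 would lead to accepting H0.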
3. Useful tests
One-sample t-test
Used when we want to test whether the mean μ of the whole population equals a constant μ0.
In two-tailed tests:
H0: μ = μ0
Ha: μ ≠ μ0
In one-tailed tests:
Right-tailed: H0: μ ≤ μ0, Ha: μ > μ0
Left-tailed: H0: μ ≥ μ0, Ha: μ < μ0
Test statistic:
\[ t := \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim T(n - 1) \]
• x̄ is the mean of the sample.
• s is the adjusted standard deviation of the sample.
• Either the sample size satisfies n > 30, or we need to assume that the whole population is normally distributed.
• T(n − 1) is the Student's t-distribution with n − 1 degrees of freedom.
Student’s t-distribution
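Density function:
If X ~ T(n), then
\[ f(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}}. \]
Gamma function:
\[ \Gamma(x) := \int_0^{+\infty} y^{x-1} e^{-y}\,dy. \]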
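The density of T(n) can be evaluated directly from the Gamma function. The sketch below works on the log scale with `math.lgamma` so that large n does not overflow; this is an implementation choice made here, and the function name is an assumption. It also prints the standard normal density for comparison.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, n):
    """Density of the Student's t-distribution T(n) at x."""
    log_const = lgamma((n + 1) / 2) - lgamma(n / 2) - 0.5 * log(n * pi)
    return exp(log_const - (n + 1) / 2 * log(1 + x * x / n))


normal = NormalDist()
for n in (1, 5, 30, 1000):
    # The t density approaches the N(0, 1) density as n grows.
    print(n, round(t_pdf(0.0, n), 4), round(normal.pdf(0.0), 4))
```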
Student’s t distribution
Property:
Let X ~ T(n). Then for n ≥ 30, X is approximately N(0, 1).
One-sample t-test
Example:
In a factory, a machine is used to package sugar in 1 kg packages. To check whether it works properly, workers randomly select 100 packages, with the following weights.
Weight (kg)   0.95  0.97  0.99  1.01  1.03  1.05
#Packages     9     31    40    15    3     2
Testing H0: μ = 1 against Ha: μ ≠ 1 gives t ≈ −6.92, p-value ≈ 4.522e−10.
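The test statistic of this example can be recomputed from the frequency table. The sketch below uses only the standard library, so the p-value it computes is the two-tailed normal approximation rather than the exact T(99) value; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

weights = [0.95, 0.97, 0.99, 1.01, 1.03, 1.05]  # package weights
counts = [9, 31, 40, 15, 3, 2]                  # number of packages per weight
mu0 = 1.0                                       # nominal weight under H0

n = sum(counts)
mean = sum(w * c for w, c in zip(weights, counts)) / n
# Adjusted (n - 1) variance computed from the frequency table.
variance = sum(c * (w - mean) ** 2 for w, c in zip(weights, counts)) / (n - 1)
s = sqrt(variance)

t = (mean - mu0) / (s / sqrt(n))
# Two-tailed p-value under the N(0, 1) approximation (not the exact T(99) value).
p_normal = 2 * NormalDist().cdf(-abs(t))

print(round(mean, 4), round(s, 4), round(t, 2))  # 0.9856 0.0208 -6.92
```

The normal approximation gives a smaller p-value than the T(99) distribution because the t-distribution has heavier tails; in both cases H0 is rejected at any usual significance level.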

Two-independent-samples t-test
Used when we want to compare the means μ1 and μ2 of two independent samples.
Two-tailed test:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
One-tailed tests:
Right-tailed: H0: μ1 ≤ μ2, Ha: μ1 > μ2
Left-tailed: H0: μ1 ≥ μ2, Ha: μ1 < μ2
Test statistic (Equal variances):
\[ t := \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim T(n_1 + n_2 - 2) \]
• x̄1, x̄2 are the means of the samples.
• s1, s2 are the adjusted standard deviations of the samples, and s_p is the pooled standard deviation:
\[ s_p^2 := \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
• Either the sample sizes satisfy n1, n2 > 30, or we need to assume a normal distribution in each group.
• The two samples are independent (otherwise, another test is applied).
Test statistic (Unequal variances):
\[ t := \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim T(df) \]
• x̄1, x̄2 are the means of the samples.
• s1, s2 are the adjusted standard deviations of the samples.
• Either the sample sizes satisfy n1, n2 > 30, or we need to assume a normal distribution in each group.
• The two samples are independent (otherwise, another test is applied).
• Degrees of freedom:
\[ df := \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} \]
Two-independent samples test
Example:
In order to compare the average weights of rural and urban births, 10000 births were weighed. Here is the summary table.
Region  #Births  Average weight  (Adjusted) standard deviation
Rural   8000     3.0 kg          0.3 kg
Urban   2000     3.2 kg          0.2 kg
Equal variances:
t ≈ −28.28, df = 9998, p-value ≈ 2.26e−169
Unequal variances:
t ≈ −35.77, df ≈ 4523, p-value ≈ 4.46e−247
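Both test statistics can be computed from the summary table alone. The sketch below applies the equal-variances and unequal-variances formulas to the rounded values in the table; the function names are assumptions. From these rounded values the pooled statistic comes out at roughly −28.3 and the unequal-variances statistic at roughly −35.8 with df close to 4523. The p-values are not computed because they need the cdf of the t-distribution, which the standard library does not provide.

```python
from math import sqrt


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variances t statistic and its degrees of freedom."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (m1 - m2) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variances t statistic and its (non-integer) degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# (mean, adjusted standard deviation, sample size) from the summary table
rural = (3.0, 0.3, 8000)
urban = (3.2, 0.2, 2000)

print(pooled_t(*rural, *urban))
print(welch_t(*rural, *urban))
```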
F-test (two independent samples)
Used when we want to compare the variances σ1² and σ2² of two independent samples.
Two-tailed test:
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²
One-tailed tests:
Right-tailed: H0: σ1² ≤ σ2², Ha: σ1² > σ2²
Left-tailed: H0: σ1² ≥ σ2², Ha: σ1² < σ2²
Test statistic:
\[ f := \frac{s_1^2}{s_2^2} \sim F(n_1 - 1, n_2 - 1) \]
• s_i is the adjusted standard deviation of the i-th sample, of size n_i, for i = 1, 2.
• F(n1 − 1, n2 − 1) is the (Fisher-Snedecor) F-distribution with n1 − 1 and n2 − 1 degrees of freedom.
F-distribution
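Density function:
If X ~ F(m, n), then for x > 0
\[ f(x) = \frac{\Gamma\left(\frac{m+n}{2}\right) m^{m/2}\, n^{n/2}\, x^{m/2 - 1}}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)(n + mx)^{(m+n)/2}}. \]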
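The F density can be evaluated in the same way as the t density. The following sketch computes it on the log scale with `math.lgamma`, which keeps it usable for large degrees of freedom; the function name is an assumption. For m = n = 2 the formula reduces to 1/(1 + x)², which gives an easy check.

```python
from math import exp, lgamma, log


def f_pdf(x, m, n):
    """Density of the F-distribution F(m, n) at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    log_num = (
        lgamma((m + n) / 2)
        + (m / 2) * log(m)
        + (n / 2) * log(n)
        + (m / 2 - 1) * log(x)
    )
    log_den = lgamma(m / 2) + lgamma(n / 2) + ((m + n) / 2) * log(n + m * x)
    return exp(log_num - log_den)


# For m = n = 2 the density is 1 / (1 + x)^2, so the value at x = 1 is 0.25.
print(f_pdf(1.0, 2, 2))
```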
F-distribution
Example:
In order to compare the average weights of rural and urban births, 10000 births were weighed. Here is the summary table.
Region  #Births  Average weight  (Adjusted) standard deviation
Rural   8000     3.0 kg          0.3 kg
Urban   2000     3.2 kg          0.2 kg
f ≈ 2.25, df1 = 7999, df2 = 1999, p-value ≈ 1.11e−16
One-way ANOVA (Analysis of variance)
Used when we want to compare the means μ_i of more than two independent samples.
H0: μ1 = ⋯ = μk
Ha: ∃ i ≠ j, μi ≠ μj
• n_i: the number of observations in the i-th group.
• The observations of the i-th group are denoted by x_{i1}, x_{i2}, …, x_{i n_i}.
• The average of the i-th group:
\[ \bar{x}_i := \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}. \]
• The adjusted variance of the i-th group:
\[ s_i^2 := \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2. \]
• n: the total number of observations, n = n_1 + ⋯ + n_k.
• The average of the whole sample:
\[ \bar{x} := \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}. \]
• The adjusted variance of the whole sample:
\[ s^2 := \frac{1}{n - 1} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2. \]
• The total sum of squares:
\[ SST := \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2. \]
• The sum of squares within groups:
\[ SSE := \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2. \]
• The sum of squares between the group means and x̄:
\[ SSA := \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 = SST - SSE. \]
Test statistic:
\[ f := \frac{n - k}{k - 1} \cdot \frac{SSA}{SSE} \sim F(k - 1, n - k) \]
where F(k − 1, n − k) is the (Fisher-Snedecor) F-distribution with k − 1 and n − k degrees of freedom.
Comments:
• The groups are independent.
• The size of each group is large enough, or the groups are normally distributed.
• Equal variances must be assumed.
Example:
Amount of alkaloid (mg) in a new herb in three regions.
Region A: 7.5, 6.8, 7.1, 7.5, 6.8, 6.6, 7.8
Region B: 5.8, 5.6, 6.1, 6.0, 5.7
Region C: 6.1, 6.3, 6.5, 6.4, 6.5, 6.3
f ≈ 26.56, df1 = 2, df2 = 15, p-value ≈ 1.17e−05
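The f statistic of this example can be recomputed from the raw data using the sums of squares SSA and SSE. This is a minimal sketch with an assumed function name; the p-value is left out because it requires the cdf of the F-distribution, which is not in the standard library.

```python
def one_way_anova(groups):
    """Return (f, df1, df2) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares within groups (SSE).
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Sum of squares between the group means and the overall mean (SSA).
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

    f = (n - k) / (k - 1) * ssa / sse
    return f, k - 1, n - k


region_a = [7.5, 6.8, 7.1, 7.5, 6.8, 6.6, 7.8]
region_b = [5.8, 5.6, 6.1, 6.0, 5.7]
region_c = [6.1, 6.3, 6.5, 6.4, 6.5, 6.3]

f, df1, df2 = one_way_anova([region_a, region_b, region_c])
print(round(f, 2), df1, df2)  # 26.56 2 15
```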
