
Z-test for Large Samples

by
Dr. Rajesh Moharana

Department of Mathematics
School of Advanced Sciences
Vellore Institute of Technology
Vellore, India

MAT 2001 (Statistics for Engineers) rajesh.moharana@vit.ac.in 1 / 24


Z-test

The Z-test is used to compare the mean of a sample with a hypothesized
population mean when the sample is large. It is also used when the
population variance is known, and for comparing a sample proportion with
a theoretical value of the population proportion.



Procedure for Hypothesis Testing
The main question in hypothesis testing is whether or not to reject the
null hypothesis. The following steps are involved in hypothesis testing.
1. State the hypotheses
The null hypothesis H0 and the alternative hypothesis H1 are constructed
in this step.
2. Identify the test statistic
For large samples, when the population's standard deviation is known, the
z-test statistic is used.
3. Specify the level of significance
The level of significance α is the probability of rejecting H0 when it is
true. It is commonly taken as 1%, 5% or 10% (i.e., α = 0.01, 0.05 or 0.1).
4. Determine the value of the test statistic.
5. Check whether the probability value (p-value) is less than or equal to
α. If it is, reject the null hypothesis; otherwise, accept it.
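Step 5 can be sketched in Python as follows. This is only an illustration and not part of the original slides: the function name `p_value`, its `tail` argument and the observed value 2.10 are assumptions made for the example. It relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist


def p_value(z: float, tail: str = "two") -> float:
    """Return the p-value of an observed z statistic.

    tail: "two" (H1: not equal), "left" (H1: less than),
          "right" (H1: greater than).
    """
    phi = NormalDist().cdf  # standard normal CDF
    if tail == "two":
        return 2.0 * (1.0 - phi(abs(z)))
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1.0 - phi(z)
    raise ValueError("tail must be 'two', 'left' or 'right'")


alpha = 0.05
z_observed = 2.10  # hypothetical value, for illustration only
p = p_value(z_observed, "two")
print(f"p-value = {p:.4f}; reject H0: {p <= alpha}")
```

Comparing the p-value with α gives the same decision as comparing the test statistic with the critical value for the same tail.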
Critical Values

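The value of the test statistic that separates the critical region from
the acceptance region is called the critical value. It is also called the
significant value and depends on
1 the level of significance and
2 the alternative hypothesis
The following table gives the critical values of Z.

Level of significance α      1%        5%        10%
Two-tailed test (|Zα|)       2.58      1.96      1.645
Right-tailed test (Zα)       2.33      1.645     1.28
Left-tailed test (Zα)       -2.33     -1.645    -1.28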
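Critical values of Z can also be obtained from the inverse of the standard normal CDF. The sketch below is an illustration under assumed names (`critical_value`, `tail`), not something taken from the slides; it uses `statistics.NormalDist.inv_cdf` from the standard library.

```python
from statistics import NormalDist


def critical_value(alpha: float, tail: str = "two") -> float:
    """Critical value of Z for a given level of significance and tail."""
    inv = NormalDist().inv_cdf  # inverse standard normal CDF
    if tail == "two":
        return inv(1.0 - alpha / 2.0)  # compare with |Z|
    if tail == "right":
        return inv(1.0 - alpha)
    if tail == "left":
        return inv(alpha)  # negative value
    raise ValueError("tail must be 'two', 'right' or 'left'")


for a in (0.01, 0.05, 0.10):
    print(
        f"alpha={a:.2f}",
        f"two={critical_value(a, 'two'):.3f}",
        f"right={critical_value(a, 'right'):.3f}",
        f"left={critical_value(a, 'left'):.3f}",
    )
```

At the 5% level this gives about 1.960 for the two-tailed test and about 1.645 for the right-tailed test, the values used in the problems that follow.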



Decision on rejecting or not rejecting the null hypothesis
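The decision compares the calculated Z with the critical value for the chosen tail: in a two-tailed test H0 is rejected when |Z| exceeds the critical value, in a right-tailed test when Z exceeds it, and in a left-tailed test when Z falls below its negative. A minimal sketch of this rule, with the hypothetical helper name `reject_h0` and made-up Z values, could look like this:

```python
from statistics import NormalDist


def reject_h0(z: float, alpha: float = 0.05, tail: str = "two") -> bool:
    """Return True when the calculated z falls in the critical region."""
    inv = NormalDist().inv_cdf
    if tail == "two":
        return abs(z) > inv(1.0 - alpha / 2.0)
    if tail == "right":
        return z > inv(1.0 - alpha)
    if tail == "left":
        return z < inv(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Hypothetical z values, for illustration only.
print(reject_h0(2.5, tail="two"))    # |z| beyond 1.96
print(reject_h0(1.0, tail="two"))    # |z| inside the acceptance region
print(reject_h0(-2.0, tail="left"))  # z below -1.645
```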



Test of Significance: Large Samples

Test of significance for single mean
Test of significance for difference of means of two large samples
Test of significance for a single proportion
Test of significance for difference of proportions



Test of significance for single mean

Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$, drawn from a large
population with mean $\mu$ and variance $\sigma^2$. Let $\bar{x}$ denote the
mean of the sample and $s$ denote the standard deviation of the sample. We
know that $\bar{x} \sim N(\mu, \sigma^2/n)$. The standard normal variate
corresponding to $\bar{x}$ is
\[ Z = \frac{\bar{x} - \mu}{\mathrm{S.E.}(\bar{x})}, \quad \text{where } \mathrm{S.E.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}. \]

We set up the null hypothesis that there is no difference between the
sample mean and the population mean. The test statistic is
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}, \quad \text{if } \sigma \text{ is known,} \]
\[ Z = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \quad \text{if } \sigma \text{ is not known.} \]
Here, $s$ is the standard deviation of the sample.
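The single-mean statistic is simple to compute directly. The following sketch uses an assumed helper name `one_sample_z` and made-up numbers (mean 52, hypothesized mean 50, standard deviation 8, n = 64); pass σ when it is known, or the sample standard deviation s for a large sample.

```python
import math


def one_sample_z(sample_mean: float, mu0: float, sd: float, n: int) -> float:
    """Z statistic for a single mean.

    sd is the population standard deviation when known, otherwise the
    sample standard deviation (large samples only).
    """
    standard_error = sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical numbers, for illustration only.
z = one_sample_z(sample_mean=52.0, mu0=50.0, sd=8.0, n=64)
print(f"Z = {z:.3f}")  # standard error is 8/8 = 1, so Z = 2.000
```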



Problem 1: The heights of college students in a city are normally
distributed with S.D. 6 cm. A sample of 100 students has a mean height of
158 cm. Test the hypothesis that the mean height of college students in
the city is 160 cm.

Solution: We have $\bar{x} = 158$ (mean of the sample),
$\mu = 160$ (mean of the population), $\sigma = 6$, $n = 100$.
Level of significance: 5%
$H_0: \mu = 160$, i.e., the difference is not significant.
$H_1: \mu \neq 160$
We apply the two-tailed test. The test statistic is
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{158 - 160}{6/\sqrt{100}} = -3.333. \]
$\therefore |Z| = 3.333$

Table value of $Z$ at the 5% level of significance = 1.96.
Since the calculated value of $|Z|$ is greater than the table value of
$Z$, we reject $H_0$ at the 5% level of significance.

Problem 2: A sample of 400 items is taken from a population whose
standard deviation is 10. The mean of the sample is 40. Test whether
the sample has come from a population with mean 38. Also calculate the
95% confidence interval for the population mean.

Solution: Null hypothesis: $H_0: \mu = 38$
Alternative hypothesis: $H_1: \mu \neq 38$
Level of significance: $\alpha = 0.05$
Test statistic:
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{40 - 38}{10/\sqrt{400}} = 4. \]
Since the calculated value of $Z$ is greater than the table value of $Z$
(1.96), we reject $H_0$ at the 5% level of significance.
The 95% confidence interval for the population mean is given by
\[ \bar{x} \pm z_\alpha \frac{\sigma}{\sqrt{n}} = 40 \pm 1.96 \times \frac{10}{\sqrt{400}} = [39.02, 40.98]. \]
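The confidence interval in Problem 2 can be reproduced with a few lines of Python. This is a sketch with an assumed function name (`mean_confidence_interval`); it takes the critical value from `statistics.NormalDist.inv_cdf` rather than from the table.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean, sigma known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Data of Problem 2: mean 40, sigma 10, n 400.
low, high = mean_confidence_interval(40, 10, 400)
print(f"[{low:.2f}, {high:.2f}]")  # [39.02, 40.98]
```

The hypothesized mean 38 lies outside this interval, which agrees with the rejection of H0 at the 5% level.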
Problem 3: The mean of a certain production process is known to be 50
with a standard deviation of 2.5. The production manager may welcome
any change in the mean value towards the higher side but would like
to safeguard against decreasing values of the mean. He takes a sample of
120 items that gives a mean value of 46.5. What inference should the
manager draw about the production process on the basis of the sample
results? Use the 5% level of significance.

Solution: Null hypothesis: $H_0: \mu = 50$
Alternative hypothesis: $H_1: \mu < 50$ (left-tailed test)
Level of significance: $\alpha = 0.05$
Test statistic:
\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{46.5 - 50}{2.5/\sqrt{120}} = -15.336. \]
$|Z| = 15.336$. Table value of $Z$ = 1.645.
Since the calculated value of $|Z|$ is greater than the table value of
$Z$, we reject $H_0$ at the 5% level of significance; the sample indicates
that the process mean has fallen below 50.



Test of significance for difference of means of two large samples

Let $\bar{x}_1$ be the mean of an independent random sample of size $n_1$
from a population with mean $\mu_1$ and variance $\sigma_1^2$. Again, let
$\bar{x}_2$ be the mean of an independent random sample of size $n_2$ from a
population with mean $\mu_2$ and variance $\sigma_2^2$. Here, $n_1$ and $n_2$
are large. Clearly,
\[ \bar{x}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right) \quad \text{and} \quad \bar{x}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right). \]
Hence, under the null hypothesis $H_0: \mu_1 = \mu_2$, the test statistic is
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}. \]


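If $\sigma_1 = \sigma_2 = \sigma$, then the test statistic is
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}. \]

If $\sigma_1$ and $\sigma_2$ are not known, the test statistic is
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \]
where $s_1$ and $s_2$ are the standard deviations of the two samples.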


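If $\sigma_1 = \sigma_2 = \sigma$ and $\sigma$ is not known, we estimate
$\sigma^2$ by using the formula
\[ \sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}. \]
In this case, the test statistic is
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2} \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}. \]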
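The common-variance case can be sketched as below. The function name `pooled_two_sample_z` and the input figures are assumptions chosen for illustration; the variance estimate is the weighted average of the two sample variances with weights n1 and n2.

```python
import math


def pooled_two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Z statistic for two large samples assuming a common, unknown variance."""
    # Weighted estimate of the common variance.
    pooled_var = (n1 * s1**2 + n2 * s2**2) / (n1 + n2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error


# Hypothetical summary data, for illustration only.
z = pooled_two_sample_z(mean1=20.0, s1=4.0, n1=100, mean2=19.0, s2=4.0, n2=100)
print(f"z = {z:.3f}")
```

When the two sample sizes are equal, this statistic coincides with the one that uses the two sample variances separately.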


Problem: Random samples drawn from two places gave the following
data relating to the heights of children.

                      Place A    Place B
Mean height           68.50      68.58
Standard deviation    2.5        3.0
Sample size           1200       1500

Test at the 5% level whether the mean height is the same for the children
at the two places.


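Solution: $\bar{x}_1 = 68.50$, $\bar{x}_2 = 68.58$,
$n_1 = 1200$, $n_2 = 1500$,
$\sigma_1 = 2.5$, $\sigma_2 = 3.0$
Level of significance = 0.05
Null hypothesis: $H_0: \mu_1 = \mu_2$
Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{68.50 - 68.58}{\sqrt{\dfrac{2.5^2}{1200} + \dfrac{3.0^2}{1500}}} = -0.756. \]
$\therefore |Z| = 0.756 < z_\alpha = 1.96$. Hence, the null hypothesis is accepted.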
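The arithmetic for the heights data can be checked with a short script. The helper name `two_sample_z` is an assumption made for this sketch; the numbers are those given in the problem.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference of means of two large samples."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Place A: mean 68.50, S.D. 2.5, n 1200; Place B: mean 68.58, S.D. 3.0, n 1500.
z = two_sample_z(68.50, 2.5, 1200, 68.58, 3.0, 1500)
print(f"Z = {z:.3f}")           # Z = -0.756
print("reject H0:", abs(z) > 1.96)  # reject H0: False
```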



Test of significance for a single proportion

Consider a sample drawn from a population with population proportion
(percentage) $P$. If $n$ denotes the sample size and $p$ is the sample
proportion or percentage, then we have
\[ \text{Standard error (percentage)} = \sqrt{\frac{P(100 - P)}{n}}. \]
The test statistic is
\[ z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1), \]
where $Q = 1 - P$ when $P$ is expressed as a proportion.
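A minimal sketch of the single-proportion statistic follows. The function name `one_proportion_z` and the counts (60 successes in 100 trials against a hypothesized proportion of 0.5) are made up for illustration and do not come from the slides.

```python
import math


def one_proportion_z(successes: int, n: int, P0: float) -> float:
    """Z statistic for a sample proportion against a hypothesized P0."""
    p = successes / n          # sample proportion
    Q0 = 1.0 - P0
    standard_error = math.sqrt(P0 * Q0 / n)
    return (p - P0) / standard_error


# Hypothetical data, for illustration only.
z = one_proportion_z(successes=60, n=100, P0=0.5)
print(f"z = {z:.3f}")  # standard error is 0.05, so z = 2.000
```

Note that the standard error uses the hypothesized proportion P0, not the sample proportion, because the statistic is evaluated under H0.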


If
\[ |z| = \frac{|p - P|}{\sqrt{PQ/n}} < z_\alpha, \]
then the difference $(p - P)$ between the sample proportion and the
population proportion is not significant at the level of significance $\alpha$.
The probable limits for the observed proportion of successes, i.e., the
sample proportion, are $P \mp z_\alpha \sqrt{PQ/n}$.
If $P$ is unknown, the confidence limits for the proportion in the
population are $p \mp z_\alpha \sqrt{pq/n}$, where $q = 1 - p$.
If the level of significance is not given, the limits are taken at 3
standard errors (about 99.7% confidence): the limits for $p$ are
$P \mp 3\sqrt{PQ/n}$ and the limits for $P$ are $p \mp 3\sqrt{pq/n}$.
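The confidence limits for an unknown population proportion can be sketched as follows. The function name `proportion_limits` and the sample figures (p = 0.6 from n = 100) are assumptions for illustration; passing `z=3.0` gives the 3-standard-error limits.

```python
import math
from statistics import NormalDist


def proportion_limits(p: float, n: int, confidence: float = 0.95, z=None):
    """Confidence limits p -/+ z * sqrt(p*q/n) for a population proportion.

    If z is given it is used directly (e.g. z=3.0); otherwise it is taken
    from the standard normal distribution for the requested confidence.
    """
    if z is None:
        z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    q = 1.0 - p
    margin = z * math.sqrt(p * q / n)
    return p - margin, p + margin


# Hypothetical sample, for illustration only.
low, high = proportion_limits(0.6, 100)
low3, high3 = proportion_limits(0.6, 100, z=3.0)
print(f"95% limits: ({low:.3f}, {high:.3f})")
print(f"3 S.E. limits: ({low3:.3f}, {high3:.3f})")
```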

Problem: If 200 people were attacked by a disease and only 180
survived, will you reject the hypothesis that the survival rate, if
attacked by this disease, is 85% in favour of the hypothesis that it is
more, at the 5% level?

Solution: Number of people who survived = $x = 180$.
Size of the sample = $n = 200$.
$p$ = proportion of people who survived = $x/n = 180/200 = 0.9$.
It is given that $P = 85\% = 0.85$.
$Q = 1 - P = 1 - 0.85 = 0.15$
Null hypothesis: $H_0: P = 0.85$
Alternative hypothesis: $H_1: P > 0.85$
Level of significance = $\alpha = 0.05$
Test statistic:
\[ z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.90 - 0.85}{\sqrt{0.85 \times 0.15/200}} = 1.980. \]
Table value of $z$ = 1.645. The calculated value of $z$ is greater than the
table value of $z$ at the 5% level of significance, so the null hypothesis
is rejected; the survival rate appears to be more than 85%.



Test of significance for difference of proportions

Suppose two samples of sizes $n_1$ and $n_2$ are drawn from two different
populations, and let $p_1$ and $p_2$ be the sample proportions. To test the
significance of the difference between the two proportions, we consider the
following cases.
Case-I When the population proportions $P_1$ and $P_2$ are known:
In this case $Q_1 = 1 - P_1$ and $Q_2 = 1 - P_2$. The test statistic is
\[ Z = \frac{p_1 - p_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}}. \]

Case-II When the population proportions $P_1$ and $P_2$ are not known but
the sample proportions $p_1$ and $p_2$ are known:
The test statistic is
\[ Z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}, \]
where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.


Case-III Method of pooling:
In this method, the sample proportions $p_1$ and $p_2$ are pooled into a
single proportion $P$, by using the formula
\[ P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \quad Q = 1 - P. \]
The test statistic in this case is
\[ Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}. \]
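The statistic based on the two sample proportions separately and the pooled statistic can be compared side by side. In the sketch below the function name `two_proportion_z`, the `pooled` flag and the counts (60 of 200 versus 45 of 200) are illustrative assumptions, not data from the slides.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int, pooled: bool = True) -> float:
    """Z statistic for the difference of two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        # Pool the two samples into a single proportion P.
        P = (n1 * p1 + n2 * p2) / (n1 + n2)
        standard_error = math.sqrt(P * (1.0 - P) * (1.0 / n1 + 1.0 / n2))
    else:
        # Use each sample proportion in its own variance term.
        standard_error = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return (p1 - p2) / standard_error


# Hypothetical counts, for illustration only.
z_pooled = two_proportion_z(60, 200, 45, 200, pooled=True)
z_unpooled = two_proportion_z(60, 200, 45, 200, pooled=False)
print(f"pooled z = {z_pooled:.3f}, unpooled z = {z_unpooled:.3f}")
```

The two versions usually give close values for large samples; pooling is the natural choice when H0 states that the two population proportions are equal.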


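Problem: A machine puts out 16 imperfect articles in a sample of 500.
After the machine is overhauled, it puts out 3 imperfect articles in a
batch of 100. Has the machine improved?

Solution: Here we have $p_1 = 16/500 = 0.032$, $p_2 = 3/100 = 0.03$,
$q_1 = 1 - p_1 = 0.968$ and $q_2 = 1 - p_2 = 0.970$.
Null hypothesis: $H_0: P_1 = P_2$
Alternative hypothesis: $H_1: P_1 > P_2$ (the proportion of imperfect
articles is lower after the overhaul)
Level of significance = 0.05
The test statistic is
\[ z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}} = \frac{0.032 - 0.030}{\sqrt{\dfrac{0.032 \times 0.968}{500} + \dfrac{0.03 \times 0.97}{100}}} = 0.106. \]
Table value of $z$ = 1.645. Since the calculated value of $z$ is less than
the table value of $z$ at the 5% level of significance, we accept the null
hypothesis; the sample does not show that the machine has improved.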



