
INTERNATIONAL UNIVERSITY (IU) Engineering Probability & Statistics

ISE Department Lecturer: Phan Nguyễn Kỳ Phúc


--------------------o0o------------------

TWO POPULATIONS HYPOTHESIS TESTING


Paired-Observations: One object is measured twice

Independent Samples: Observe different groups of persons or things, at different times or under different sets of circumstances

Comparison of Mean Using Paired-Observations


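2-TAILED TEST

Notation: $\bar{D}$ is the sample mean of the paired differences, $D$ is the hypothesized mean difference, and $n$ is the number of pairs.

Case 1: std ($\sigma_D$) of population is GIVEN; sample size $n$
  Hypothesis: $H_0: \mu_D = D$ vs. $H_1: \mu_D \neq D$
  Distribution of $\bar{D}$: z-distribution, $N\left(\mu_D,\ \sigma_D^2/n\right)$
  Test statistic: $z_t = \dfrac{\bar{D} - D}{\sigma_D/\sqrt{n}}$
  Critical values: $\pm z_{\alpha/2}$
  Decision: reject $H_0$ if $z_t > z_{\alpha/2}$ or $z_t < -z_{\alpha/2}$
  Confidence interval of population mean: $\bar{D} \pm z_{\alpha/2}\,\dfrac{\sigma_D}{\sqrt{n}}$

Case 2: std ($\sigma_D$) of population is NOT GIVEN; small sample size ($n < 30$)
  Hypothesis: $H_0: \mu_D = D$ vs. $H_1: \mu_D \neq D$
  Distribution of $\bar{D}$: t-distribution, $\left(\mu_D,\ s_D^2/n\right)$, dof $= n - 1$
  Test statistic: $t_t = \dfrac{\bar{D} - D}{s_D/\sqrt{n}}$
  Critical values: $\pm t_{\alpha/2,\,n-1}$
  Decision: reject $H_0$ if $t_t > t_{\alpha/2,\,n-1}$ or $t_t < -t_{\alpha/2,\,n-1}$
  Confidence interval of population mean: $\bar{D} \pm t_{\alpha/2,\,n-1}\,\dfrac{s_D}{\sqrt{n}}$

Case 3: std ($\sigma_D$) of population is NOT GIVEN; large sample size ($n \ge 30$)
  Hypothesis: $H_0: \mu_D = D$ vs. $H_1: \mu_D \neq D$
  Distribution of $\bar{D}$: z-distribution, $N\left(\mu_D,\ s_D^2/n\right)$
  Test statistic: $z_t = \dfrac{\bar{D} - D}{s_D/\sqrt{n}}$
  Critical values: $\pm z_{\alpha/2}$
  Decision: reject $H_0$ if $z_t > z_{\alpha/2}$ or $z_t < -z_{\alpha/2}$
  Confidence interval of population mean: $\bar{D} \pm z_{\alpha/2}\,\dfrac{s_D}{\sqrt{n}}$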

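1-TAILED TEST

Each entry gives the left-tailed form first, with the right-tailed form in parentheses.

Case 1: std ($\sigma_D$) of population is GIVEN; sample size $n$
  Hypothesis: $H_0: \mu_D \ge D$ ($\mu_D \le D$) vs. $H_1: \mu_D < D$ ($\mu_D > D$)
  Distribution of $\bar{D}$: z-distribution, $N\left(\mu_D,\ \sigma_D^2/n\right)$
  Test statistic: $z_t = \dfrac{\bar{D} - D}{\sigma_D/\sqrt{n}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  Confidence interval of population mean: $\bar{D} \pm z_{\alpha/2}\,\dfrac{\sigma_D}{\sqrt{n}}$

Case 2: std ($\sigma_D$) of population is NOT GIVEN; small sample size ($n < 30$)
  Hypothesis: $H_0: \mu_D \ge D$ ($\mu_D \le D$) vs. $H_1: \mu_D < D$ ($\mu_D > D$)
  Distribution of $\bar{D}$: t-distribution, $\left(\mu_D,\ s_D^2/n\right)$, dof $= n - 1$
  Test statistic: $t_t = \dfrac{\bar{D} - D}{s_D/\sqrt{n}}$
  Critical values: $-t_{\alpha,\,n-1}$ ($t_{\alpha,\,n-1}$)
  Decision: reject $H_0$ if $t_t < -t_{\alpha,\,n-1}$ ($t_t > t_{\alpha,\,n-1}$)
  Confidence interval of population mean: $\bar{D} \pm t_{\alpha/2,\,n-1}\,\dfrac{s_D}{\sqrt{n}}$

Case 3: std ($\sigma_D$) of population is NOT GIVEN; large sample size ($n \ge 30$)
  Hypothesis: $H_0: \mu_D \ge D$ ($\mu_D \le D$) vs. $H_1: \mu_D < D$ ($\mu_D > D$)
  Distribution of $\bar{D}$: z-distribution, $N\left(\mu_D,\ s_D^2/n\right)$
  Test statistic: $z_t = \dfrac{\bar{D} - D}{s_D/\sqrt{n}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  Confidence interval of population mean: $\bar{D} \pm z_{\alpha/2}\,\dfrac{s_D}{\sqrt{n}}$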
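Comparison of Mean Using Independent Sampling

2-TAILED TEST

Case 1: std ($\sigma_1$) and std ($\sigma_2$) of the 2 populations are GIVEN; sample size $n_1$, $n_2$
  Hypothesis: $H_0: \mu_1 - \mu_2 = D$ vs. $H_1: \mu_1 - \mu_2 \neq D$
  Distribution of $\bar{x}_1 - \bar{x}_2$: z-distribution, $N\left(\mu_1 - \mu_2,\ \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$
  Test statistic: $z_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  Critical values: $\pm z_{\alpha/2}$
  Decision: reject $H_0$ if $z_t > z_{\alpha/2}$ or $z_t < -z_{\alpha/2}$
  CI: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Case 2: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; large sample ($n_1, n_2 \ge 30$)
  Hypothesis: $H_0: \mu_1 - \mu_2 = D$ vs. $H_1: \mu_1 - \mu_2 \neq D$
  Distribution of $\bar{x}_1 - \bar{x}_2$: z-distribution, $N\left(\mu_1 - \mu_2,\ \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)$
  Test statistic: $z_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Critical values: $\pm z_{\alpha/2}$
  Decision: reject $H_0$ if $z_t > z_{\alpha/2}$ or $z_t < -z_{\alpha/2}$
  CI: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Case 3: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; equal variance assumption is NOT GIVEN and $n_1, n_2 < 30$
  Hypothesis: $H_0: \mu_1 - \mu_2 = D$ vs. $H_1: \mu_1 - \mu_2 \neq D$
  Distribution of $\bar{x}_1 - \bar{x}_2$: t-distribution, $\left(\mu_1 - \mu_2,\ \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)$
  Degrees of freedom: $df = \dfrac{(\omega_1 + \omega_2)^2}{\dfrac{\omega_1^2}{n_1 - 1} + \dfrac{\omega_2^2}{n_2 - 1}}$, rounded down, where $\omega_1 = \dfrac{s_1^2}{n_1}$ and $\omega_2 = \dfrac{s_2^2}{n_2}$
  Test statistic: $t_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Critical values: $\pm t_{\alpha/2,\,df}$
  Decision: reject $H_0$ if $t_t > t_{\alpha/2,\,df}$ or $t_t < -t_{\alpha/2,\,df}$
  CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Case 4: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; equal variance assumption is GIVEN and $n_1, n_2 < 30$
  Hypothesis: $H_0: \mu_1 - \mu_2 = D$ vs. $H_1: \mu_1 - \mu_2 \neq D$
  Distribution of $\bar{x}_1 - \bar{x}_2$: t-distribution, $\left(\mu_1 - \mu_2,\ \dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}\right)$
  Pooled variance: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
  Degrees of freedom: $df = n_1 + n_2 - 2$
  Test statistic: $t_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Critical values: $\pm t_{\alpha/2,\,df}$
  Decision: reject $H_0$ if $t_t > t_{\alpha/2,\,df}$ or $t_t < -t_{\alpha/2,\,df}$
  CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$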
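The two small-sample cases for independent samples differ only in the variance estimate and the degrees of freedom. The sketch below computes the pooled-variance t statistic and, for comparison, the rounded-down degrees of freedom used when equal variances are not assumed. It uses the viscosity data of Question 3, which states the equal-variance assumption. The function names `pooled_t` and `unequal_variance_df` are hypothetical, α = 0.05 is an assumption (Question 3 gives no level), and t_{0.025, 11} = 2.201 is read from a t-table.

```python
import math
import statistics


def pooled_t(x1, x2, d=0.0):
    """Return (t_t, df, s_p^2) for the equal-variance (pooled) case."""
    n1, n2 = len(x1), len(x2)
    s1_sq = statistics.variance(x1)  # sample variances (n - 1 divisor)
    s2_sq = statistics.variance(x2)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    t_t = (statistics.mean(x1) - statistics.mean(x2) - d) / se
    return t_t, n1 + n2 - 2, sp_sq


def unequal_variance_df(x1, x2):
    """Degrees of freedom when equal variances are NOT assumed, rounded down."""
    n1, n2 = len(x1), len(x2)
    w1 = statistics.variance(x1) / n1  # s1^2 / n1
    w2 = statistics.variance(x2) / n2  # s2^2 / n2
    return math.floor((w1 + w2) ** 2 / (w1 ** 2 / (n1 - 1) + w2 ** 2 / (n2 - 1)))


# Data from Question 3 (viscosity of two brands of car oil).
brand1 = [10.62, 10.58, 10.33, 10.72, 10.44, 10.74]
brand2 = [10.50, 10.52, 10.58, 10.62, 10.55, 10.51, 10.53]

t_t, df, sp_sq = pooled_t(brand1, brand2)
df_unequal = unequal_variance_df(brand1, brand2)

T_CRIT = 2.201  # t_{0.025, 11} from a t-table; alpha = 0.05 is an assumption
reject_h0 = abs(t_t) > T_CRIT

print(f"s_p^2 = {sp_sq:.5f}, t_t = {t_t:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"df without the equal-variance assumption: {df_unequal}")
```

With these data the pooled statistic is about 0.44 on 11 degrees of freedom, well inside the acceptance region. Dropping the equal-variance assumption would reduce the degrees of freedom to 5, because the two sample variances are quite different.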
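
1-TAILED TEST

Each entry gives the left-tailed form first, with the right-tailed form in parentheses.

Case 1: std ($\sigma_1$) and std ($\sigma_2$) of the 2 populations are GIVEN; sample size $n_1$, $n_2$
  Hypothesis: $H_0: \mu_1 - \mu_2 \ge D$ ($\mu_1 - \mu_2 \le D$) vs. $H_1: \mu_1 - \mu_2 < D$ ($\mu_1 - \mu_2 > D$)
  Distribution of $\bar{x}_1 - \bar{x}_2$: z-distribution, $N\left(\mu_1 - \mu_2,\ \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$
  Test statistic: $z_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  CI: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Case 2: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; large sample ($n_1, n_2 \ge 30$)
  Hypothesis: $H_0: \mu_1 - \mu_2 \ge D$ ($\mu_1 - \mu_2 \le D$) vs. $H_1: \mu_1 - \mu_2 < D$ ($\mu_1 - \mu_2 > D$)
  Distribution of $\bar{x}_1 - \bar{x}_2$: z-distribution, $N\left(\mu_1 - \mu_2,\ \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)$
  Test statistic: $z_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  CI: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Case 3: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; equal variance assumption is NOT GIVEN and $n_1, n_2 < 30$
  Hypothesis: $H_0: \mu_1 - \mu_2 \ge D$ ($\mu_1 - \mu_2 \le D$) vs. $H_1: \mu_1 - \mu_2 < D$ ($\mu_1 - \mu_2 > D$)
  Distribution of $\bar{x}_1 - \bar{x}_2$: t-distribution, $\left(\mu_1 - \mu_2,\ \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)$
  Degrees of freedom: $df = \dfrac{(\omega_1 + \omega_2)^2}{\dfrac{\omega_1^2}{n_1 - 1} + \dfrac{\omega_2^2}{n_2 - 1}}$, rounded down, where $\omega_1 = \dfrac{s_1^2}{n_1}$ and $\omega_2 = \dfrac{s_2^2}{n_2}$
  Test statistic: $t_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  Critical values: $-t_{\alpha,\,df}$ ($t_{\alpha,\,df}$)
  Decision: reject $H_0$ if $t_t < -t_{\alpha,\,df}$ ($t_t > t_{\alpha,\,df}$)
  CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Case 4: std ($\sigma_1$) and std ($\sigma_2$) are NOT GIVEN; equal variance assumption is GIVEN and $n_1, n_2 < 30$
  Hypothesis: $H_0: \mu_1 - \mu_2 \ge D$ ($\mu_1 - \mu_2 \le D$) vs. $H_1: \mu_1 - \mu_2 < D$ ($\mu_1 - \mu_2 > D$)
  Distribution of $\bar{x}_1 - \bar{x}_2$: t-distribution, $\left(\mu_1 - \mu_2,\ \dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}\right)$
  Pooled variance: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
  Degrees of freedom: $df = n_1 + n_2 - 2$
  Test statistic: $t_t = \dfrac{\bar{x}_1 - \bar{x}_2 - D}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Critical values: $-t_{\alpha,\,df}$ ($t_{\alpha,\,df}$)
  Decision: reject $H_0$ if $t_t < -t_{\alpha,\,df}$ ($t_t > t_{\alpha,\,df}$)
  CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$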
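When both population standard deviations are given, the one-tailed z-test can also be reported through its p-value. The sketch below does this for the right-tailed hypotheses of Question 5 (H0: μ1 ≤ μ2 vs. H1: μ1 > μ2, so D = 0) with σ1 = 10 and the three values of σ2 listed there. The function name `right_tailed_z` is hypothetical; `statistics.NormalDist` from the Python standard library supplies the standard normal CDF.

```python
import math
import statistics
from statistics import NormalDist


def right_tailed_z(x1, x2, sigma1, sigma2, d=0.0):
    """Return (z_t, p_value) for H1: mu1 - mu2 > d with known sigma1, sigma2."""
    se = math.sqrt(sigma1 ** 2 / len(x1) + sigma2 ** 2 / len(x2))
    z_t = (statistics.mean(x1) - statistics.mean(x2) - d) / se
    p_value = 1.0 - NormalDist().cdf(z_t)  # upper-tail area beyond z_t
    return z_t, p_value


# Data from Question 5.
sample1 = [122, 114, 130, 165, 144, 133, 139, 142, 150]
sample2 = [108, 125, 122, 140, 132, 120, 137, 128, 138]

# sigma1 = 10 throughout; sigma2 takes the values of parts (a), (b), (c).
results = {s2: right_tailed_z(sample1, sample2, 10, s2) for s2 in (5, 10, 20)}

for s2, (z_t, p) in results.items():
    print(f"sigma2 = {s2:>2}: z_t = {z_t:.3f}, p-value = {p:.4f}")
```

The sample means do not change between the three parts, so a larger σ2 only inflates the standard error: the statistic shrinks and the p-value grows. H0 is rejected at level α whenever the p-value is below α, which is equivalent to z_t > z_α.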
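
PROPORTION TEST

Notation: $x_1$, $x_2$ are the numbers of successes in samples of size $n_1$, $n_2$; $\hat{p}_1 = \dfrac{x_1}{n_1}$, $\hat{p}_2 = \dfrac{x_2}{n_2}$; the pooled estimate is $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$. One-tailed entries give the left-tailed form first, with the right-tailed form in parentheses.

Case 1: 2-tailed test; large sample ($n_1, n_2 \ge 30$)
  Hypothesis: $H_0: p_1 - p_2 = D$ vs. $H_1: p_1 - p_2 \neq D$
  Distribution of $\hat{p}_1 - \hat{p}_2$: z-distribution, $N\left(p_1 - p_2,\ \hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)\right)$
  Test statistic: $z_t = \dfrac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Critical values: $\pm z_{\alpha/2}$
  Decision: reject $H_0$ if $z_t > z_{\alpha/2}$ or $z_t < -z_{\alpha/2}$
  CI: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

Case 2: 1-tailed test; large sample ($n_1, n_2 \ge 30$)
  Hypothesis: $H_0: p_1 - p_2 \ge D$ ($p_1 - p_2 \le D$) vs. $H_1: p_1 - p_2 < D$ ($p_1 - p_2 > D$)
  Distribution of $\hat{p}_1 - \hat{p}_2$: z-distribution, $N\left(p_1 - p_2,\ \hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)\right)$
  Test statistic: $z_t = \dfrac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  CI: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

Case 3: 1-tailed test on $p_1 - p_2$ against $D$, using the separate estimates $\hat{p}_1$ and $\hat{p}_2$; large sample ($n_1, n_2 \ge 30$)
  Hypothesis: $H_0: p_1 - p_2 \ge D$ ($p_1 - p_2 \le D$) vs. $H_1: p_1 - p_2 < D$ ($p_1 - p_2 > D$)
  Distribution of $\hat{p}_1 - \hat{p}_2$: z-distribution, $N\left(p_1 - p_2,\ \dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}\right)$
  Test statistic: $z_t = \dfrac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$
  Critical values: $-z_{\alpha}$ ($z_{\alpha}$)
  Decision: reject $H_0$ if $z_t < -z_{\alpha}$ ($z_t > z_{\alpha}$)
  CI: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$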
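A short Python sketch of the two-tailed proportion test with the pooled estimate. The handout contains no proportion data, so the counts below (60 successes out of 200 versus 45 out of 200) are purely hypothetical, as are the function name `pooled_two_proportion_z` and the level α = 0.05. The example takes D = 0, which is the situation in which pooling the two samples is the usual choice.

```python
import math
from statistics import NormalDist


def pooled_two_proportion_z(x1, n1, x2, n2, d=0.0):
    """Return z_t for H0: p1 - p2 = d using the pooled estimate p_hat."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat - d) / se


# Hypothetical counts, for illustration only (not taken from the handout).
z_t = pooled_two_proportion_z(60, 200, 45, 200)

ALPHA = 0.05  # assumed significance level
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # z_{alpha/2}, about 1.96
reject_h0 = abs(z_t) > z_crit  # two-tailed decision rule

print(f"z_t = {z_t:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject_h0}")
```

For these illustrative counts the statistic is about 1.70, below the two-tailed critical value, so H0 would not be rejected.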
VARIANCE TEST

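Case 1: 2-tailed test; sample size $n_1$, $n_2$
  Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
  Distribution of $s_1^2/s_2^2$: F-distribution with $k_1 = n_1 - 1$, $k_2 = n_2 - 1$
  Test statistic: $F = \dfrac{s_1^2}{s_2^2}$
  Critical values: $F_{1-\alpha/2,\,k_1,\,k_2}$ or $F_{\alpha/2,\,k_1,\,k_2}$
  Decision: reject $H_0$ if $F < F_{1-\alpha/2,\,k_1,\,k_2}$ or $F > F_{\alpha/2,\,k_1,\,k_2}$

Case 2: 1-tailed test; sample size $n_1$, $n_2$
  Hypothesis: $H_0: \sigma_1^2 \le \sigma_2^2$ vs. $H_1: \sigma_1^2 > \sigma_2^2$
  Distribution of $s_1^2/s_2^2$: F-distribution with $k_1 = n_1 - 1$, $k_2 = n_2 - 1$
  Test statistic: $F = \dfrac{s_1^2}{s_2^2}$
  Critical values: $F_{\alpha,\,k_1,\,k_2}$
  Decision: reject $H_0$ if $F > F_{\alpha,\,k_1,\,k_2}$

Note

$F_{1-\alpha,\,k_2,\,k_1} = \dfrac{1}{F_{\alpha,\,k_1,\,k_2}}$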
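The two-tailed variance test can be sketched in Python with the PCB data of Question 10 (α = .10, so each tail has area .05). The Python standard library has no F quantile function, so the upper critical value F_{0.05, 7, 7} = 3.79 is read from a standard F-table, and the lower one follows from the reciprocal identity in the note above. The function name `variance_ratio` is hypothetical.

```python
import statistics


def variance_ratio(x1, x2):
    """Return (F, k1, k2) with F = s1^2 / s2^2, k1 = n1 - 1, k2 = n2 - 1."""
    f_stat = statistics.variance(x1) / statistics.variance(x2)
    return f_stat, len(x1) - 1, len(x2) - 1


# Data from Question 10 (PCB readings in parts per million).
method1 = [6.2, 5.8, 5.7, 6.3, 5.8, 6.1, 6.2, 5.7]
method2 = [6.3, 5.3, 5.9, 6.4, 5.9, 6.2, 6.3, 5.5]

f_stat, k1, k2 = variance_ratio(method1, method2)

F_UPPER = 3.79           # F_{0.05, 7, 7} from an F-table (alpha/2 = 0.05)
F_LOWER = 1.0 / F_UPPER  # F_{0.95, 7, 7} = 1 / F_{0.05, 7, 7}; here k1 = k2 = 7
reject_h0 = f_stat < F_LOWER or f_stat > F_UPPER

print(f"F = {f_stat:.4f}, k1 = {k1}, k2 = {k2}")
print(f"acceptance region: ({F_LOWER:.3f}, {F_UPPER:.2f}), reject H0: {reject_h0}")
```

With these data F is about 0.38, which falls between the two critical values, so the hypothesis of equal variances would not be rejected at the .10 level.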


Homework & Example

Question 1: A method for measuring the pH level of a solution yields a measurement value that is normally distributed with a mean
equal to the actual pH of the solution and with a standard deviation equal to .05. An environmental pollution scientist claims that two
different solutions come from the same source. If this were so, then the pH level of the solutions would be equal. To test the
plausibility of this claim, 10 independent measurements were made of the pH level for both solutions, with the following data
resulting.

A 6.24 6.31 6.28 6.30 6.25 6.26 6.24 6.29 6.22 6.28
B 6.27 6.25 6.33 6.27 6.24 6.31 6.28 6.29 6.34 6.27
Do the data disprove the scientist’s claim? Use the 5 percent level of significance.

Question 2: Twenty-five men between the ages of 25 and 30, who were participating in a well-known heart study carried out in
Framingham, Massachusetts, were randomly selected. Of these, 11 were smokers and 14 were not. The following data refer to
readings of their systolic blood pressure.

Smoke 124 134 136 125 133 127 135 131 133 125 118
NSmoke 130 122 128 129 118 122 116 127 135 120 122 120 115 123
Use these data to test the hypothesis that the mean blood pressures of smokers and nonsmokers are the same.

Question 3: The viscosity of two different brands of car oil is measured and the following data resulted

Brand 1 10.62 10.58 10.33 10.72 10.44 10.74
Brand 2 10.50 10.52 10.58 10.62 10.55 10.51 10.53
Test the hypothesis that the mean viscosity of the two brands is equal, assuming that the populations have normal distributions with
equal variances.

Question 4: A sample of 10 fish was caught at Lake A and their PCB concentrations were measured using a certain technique. The
resulting data in parts per million were

Lake A: 11.5, 10.8, 11.6, 9.4, 12.4, 11.4, 12.2, 11, 10.6, 10.8

In addition, a sample of 8 fish was caught at Lake B and their levels of PCB were measured by a different technique than that used at
Lake A. The resultant data were

Lake B: 11.8, 12.6, 12.2, 12.5, 11.7, 12.1, 10.4, 12.6


If it is known that the measuring technique used at Lake A has a variance of .09 whereas the one used at Lake B has a variance of .16,
could you reject (at the 5 percent level of significance) a claim that the two lakes are equally contaminated?

Question 5: The following are the values of independent samples from two different populations.

Sample 1 122 114 130 165 144 133 139 142 150
Sample 2 108 125 122 140 132 120 137 128 138
Let μ1 and μ2 be the respective means of the two populations. Find the p-value of the test of the null hypothesis

H0: μ1 ≤ μ2 vs. H1: μ1 > μ2, when the population standard deviations are σ1 = 10 and (a) σ2 = 5; (b) σ2 = 10; (c) σ2 = 20.

Question 6: Ten pregnant women were given an injection of pitocin to induce labor. Their systolic blood pressures immediately before
and after the injection were:

Patient 1 2 3 4 5 6 7 8 9 10
Before 134 122 132 130 128 140 118 127 125 142
After 140 130 135 126 134 138 124 126 132 144
Do the data indicate that injection of this drug changes blood pressure?

Question 7: A question of medical importance is whether jogging leads to a reduction in one’s pulse rate. To test this hypothesis, 8
nonjogging volunteers agreed to begin a 1-month jogging program. After the month their pulse rates were determined and compared
with their earlier values. If the data are as follows, can we conclude that jogging has had an effect on the pulse rates?

Subject 1 2 3 4 5 6 7 8
Before 74 86 98 102 78 84 79 70
After 70 85 90 110 71 80 69 74

Question 8: A pharmaceutical house produces a certain drug item whose weight has a standard deviation of .5 milligrams. The
company’s research team has proposed a new method of producing the drug. However, this entails some costs and will be adopted
only if there is strong evidence that the standard deviation of the weight of the items will drop to below .4 milligrams. If a sample of
10 items is produced and has the following weights, should the new method be adopted?

No 1 2 3 4 5 6 7 8 9 10
Value 5.728 5.731 5.722 5.719 5.727 5.724 5.718 5.726 5.723 5.722
Question 9: A gun-like apparatus has recently been designed to replace needles in administering vaccines. The apparatus can be set to
inject different amounts of the serum, but because of random fluctuations the actual amount injected is normally distributed with a
mean equal to the setting and with an unknown variance σ². It has been decided that the apparatus would be too dangerous to use if σ
exceeds .10. If a random sample of 50 injections resulted in a sample standard deviation of .08, should use of the new apparatus be
discontinued? Suppose the level of significance is α = .10. Comment on the appropriate choice of a significance level for this problem,
as well as the appropriate choice of the null hypothesis.

Question 10: The production of large electrical transformers and capacitors requires the use of polychlorinated biphenyls (PCBs),
which are extremely hazardous when released into the environment. Two methods have been suggested to monitor the levels of PCB
in fish near a large plant. It is believed that each method will result in a normal random variable that depends on the method. Test the
hypothesis at the α = .10 level of significance that both methods have the same variance, if a given fish is checked 8 times by each
method with the following data (in parts per million) recorded

Method 1 6.2 5.8 5.7 6.3 5.8 6.1 6.2 5.7
Method 2 6.3 5.3 5.9 6.4 5.9 6.2 6.3 5.5
