Hypothesis Testing: Two Sample Test for Means and Proportions

Introduction:

The two sample test is similar to the one sample test, except that we are now testing for differences between two populations rather than a sample and a population. There are two types of two sample tests:

- Hypothesis Testing with Sample Means (Large Samples)
- Hypothesis Testing with Sample Means (Small Samples)

**The Question to be Answered:**

"Is the difference between sample statistics large enough to conclude that the populations represented by the samples are significantly different?"

An overview of Two-Sample Hypothesis Testing

You can begin by assuming that there is no difference in the mean times of the two populations, that is, µ1 − µ2 = 0. Then, by taking a random sample from each population and using the resulting two-sample test statistic x̄1 − x̄2, you can perform a two-sample hypothesis test. Suppose you obtain the following results.

Null Hypothesis:

The H0 is that the populations are the same.

H0: µ1 = µ2

If the difference between the sample statistics is large enough, or, if a difference of this size is unlikely, assuming that the H0 is true, we will reject the H0 and conclude there is a difference between the populations.

Null Hypothesis (cont.)

- The H0 is a statement of "no difference".
- The 0.05 level will continue to be our indicator of a significant difference.
- We convert the sample statistics to a Z score.

Alternate Hypothesis:

The alternate hypothesis is the research hypothesis. If the null hypothesis is rejected, then we will have found evidence to support the research hypothesis.

H1: µ1 ≠ µ2

Two cities, Bradford and Kane, are separated only by the River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported that the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the 0.01 significance level, can we conclude that the mean income in Bradford is higher?

Step 1 State the null and alternate hypotheses. H0: µB ≤ µK, H1: µB > µK

Step 2 State the level of significance. The 0.01 significance level is stated in the problem.

Step 3 Find the appropriate test statistic. Because both samples contain more than 30 observations, we can use z as the test statistic.

Step 4 State the decision rule. The null hypothesis is rejected if the actual z is greater than the critical z of 2.33, or if p < 0.01.

Solution

Step 5: Compute the value of actual z and make a decision

Actual z:

$$z = \frac{\$38{,}000 - \$35{,}000}{\sqrt{\dfrac{(\$6{,}000)^2}{40} + \dfrac{(\$7{,}000)^2}{35}}} = 1.98$$

Because the actual Z = 1.98 < critical Z = 2.33, and since the p-value = 0.0239 > α = 0.01, the decision is not to reject the null hypothesis. Thus we cannot conclude that the mean household income in Bradford is larger than the mean household income in Kane.
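The Bradford and Kane calculation can be checked with a short script. This is a minimal sketch using only the figures quoted in the problem; the function names `two_sample_z` and `upper_tail_p` are illustrative and not part of the original material. Because the script does not round z before computing the p-value, its p-value may differ from the table-based value in the last digit.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two large-sample means."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

def upper_tail_p(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Bradford: mean 38,000, SD 6,000, n = 40; Kane: mean 35,000, SD 7,000, n = 35
z = two_sample_z(38_000, 6_000, 40, 35_000, 7_000, 35)
p_value = upper_tail_p(z)

# Decision rule from Step 4: reject H0 if z exceeds the critical z of 2.33
reject_h0 = z > 2.33

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The script reaches the same decision as the worked solution: z stays below 2.33, so H0 is not rejected.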

**Hypothesis Tests About µ1 − µ2: σ1 and σ2 Unknown and Small Sample**

When σ1 and σ2 are unknown, we will:

- use the sample standard deviations s1 and s2 as estimates of σ1 and σ2, and
- use the t table instead of the Z table, assuming at least one of the samples has n < 30.

Hypotheses

- Left-tailed: H0: µ1 − µ2 ≥ D0, Ha: µ1 − µ2 < D0
- Right-tailed: H0: µ1 − µ2 ≤ D0, Ha: µ1 − µ2 > D0
- Two-tailed: H0: µ1 − µ2 = D0, Ha: µ1 − µ2 ≠ D0

Test Statistic (Actual t), when n < 30 and σ is unknown:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
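The small-sample test statistic above can be sketched in a few lines of code. The function name `two_sample_t` and the input values are hypothetical, chosen only to illustrate the arithmetic. The sketch computes the statistic only: the source does not state which degrees of freedom to use, so the critical t value still has to be looked up separately.

```python
import math

def two_sample_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Actual t for H0: mu1 - mu2 = d0, using sample SDs s1 and s2."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((mean1 - mean2) - d0) / standard_error

# Hypothetical samples (not from the original slides):
# sample 1: mean 52.0, s = 4.0, n = 16; sample 2: mean 49.0, s = 5.0, n = 25
t = two_sample_t(52.0, 4.0, 16, 49.0, 5.0, 25)
print(f"actual t = {t:.3f}")
```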

Ex. 1: A Two-Sample z-test for the Difference Between Means

An advertising executive claims that there is a difference in the mean household income for credit card holders of Visa Gold and of MasterCard Gold. The results of a random survey of 100 customers from each group are shown. The two samples are independent. Do the results support the executive's claim? Use α = 0.05.

Solution:

You want to test the claim that there is a difference in the mean household incomes for Visa Gold and MasterCard Gold credit card holders. So, the null and alternative hypotheses are: H0: µ1 = µ2 and Ha: µ1 ≠ µ2 (Claim)

Because the test is a two-tailed test and the level of significance is α = 0.05, you look up the critical values and find they are −1.96 and 1.96. The rejection regions are z < −1.96 and z > 1.96. Because both samples are large, s1 and s2 are used to calculate the standard error.

Using the z-test, the standardized test statistic is:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$$

$$z \approx \frac{(60{,}900 - 64{,}300) - 0}{1921} \approx -1.770$$

The graph at the left shows the location of the rejection regions and the standardized test statistic, z. Because z is not in the rejection region, you should fail to reject the null hypothesis. At the 5% level, there is not enough evidence to conclude that there is a significant difference in the mean household incomes of Visa Gold and MasterCard Gold credit card holders.
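The decision in this example can be reproduced numerically. The sketch below takes the sample means (60,900 and 64,300) and the standard error (1921) exactly as quoted in the solution; the variable names are illustrative only.

```python
# Values quoted in the worked solution
mean_visa = 60_900
mean_mastercard = 64_300
standard_error = 1921        # standard error of the difference, as given
hypothesized_difference = 0  # H0: mu1 = mu2

z = ((mean_visa - mean_mastercard) - hypothesized_difference) / standard_error

# Two-tailed test at alpha = 0.05: reject H0 if z < -1.96 or z > 1.96
critical_z = 1.96
in_rejection_region = abs(z) > critical_z

print(f"z = {z:.3f}, in rejection region: {in_rejection_region}")
```

Since z lies between −1.96 and 1.96, the script agrees with the conclusion to fail to reject H0.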

Example: Hypothesis Testing in the Two Sample Case

Problem Information:

| | Sample 1 (M.Class) | Sample 2 (W.Class) |
|---|---|---|
| Mean | X̄1 = 8.7 | X̄2 = 5.7 |
| Standard deviation | S1 = 0.3 | S2 = 1.1 |
| Sample size | N1 = 89 | N2 = 55 |

Step 2 State the Null Hypothesis

H0: µ1 = µ2

The Null asserts there is no significant difference between the populations.

H1: µ1 ≠ µ2

The research hypothesis contradicts the H0 and asserts there is a significant difference between the populations.

Step 3 Select the Sampling Distribution and Establish the Critical Region

- Sampling Distribution = Z distribution
- Alpha (α) = 0.05
- Z (critical) = ±1.96

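Using the formula:

**Compute the pooled estimate (S.E.):**

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}} = \sqrt{\frac{0.3^2}{89} + \frac{1.1^2}{55}} = \sqrt{0.001 + 0.022} = 0.152$$

Solve for Z:

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}} = \frac{8.7 - 5.7}{0.152} = 19.74$$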

Step 5 Make a Decision

The obtained test statistic (Z = 19.74) falls in the Critical Region, so reject the null hypothesis. The difference between the sample means is so large that we can conclude (at α = 0.05) that a difference exists between the populations represented by the samples. The difference between the email usage of middle class and working class families is significant (Z = 19.74, α = 0.05).

Two-tailed Hypothesis Test:

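[Figure: normal curve with the critical regions (C) below Z = −1.96 and above Z = +1.96; the obtained Z = 19.74 lies in the upper critical region.]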

When α = .05, then .025 of the area lies in each tail of the curve, in the areas marked (C). The .95 in the middle section represents no significant difference between the two populations. The cut-off between the middle section and each .025 tail is a Z-value of ±1.96.
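The ±1.96 cut-off can be recovered from the standard normal distribution instead of a printed table. This is a small sketch using Python's standard library; the helper name `two_tailed_critical_z` is an assumption made for illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Z value that leaves alpha/2 of the area in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

z_critical = two_tailed_critical_z(0.05)

# The obtained Z from the example is compared against the cut-off
obtained_z = 19.74
falls_in_critical_region = abs(obtained_z) > z_critical

print(f"critical Z = ±{z_critical:.2f}, reject H0: {falls_in_critical_region}")
```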

Significance Vs. Importance

**When working with large samples, even small differences may be significant.**

The value of the test statistic (step 4) increases with N, because the standard error in its denominator shrinks as N grows. The larger the N, the greater the value of the test statistic, and the more likely it is to fall in the critical region (region of rejection) and be declared significant.
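A quick numerical illustration of this effect: hold the difference between the sample means and the standard deviations fixed, and vary only the sample size. The difference (0.5), the common standard deviation (5.0), and the sample sizes below are hypothetical values picked for the sketch, not data from the slides.

```python
import math

def z_for_equal_groups(difference, sd, n):
    """Z for two samples of size n that share the same standard deviation."""
    standard_error = math.sqrt(sd**2 / n + sd**2 / n)
    return difference / standard_error

# Same small difference in means, increasingly large samples
results = {n: z_for_equal_groups(0.5, 5.0, n) for n in (25, 100, 400, 1600)}

for n, z in results.items():
    significant = abs(z) > 1.96  # two-tailed test at alpha = 0.05
    print(f"N = {n:>4} per group: Z = {z:.2f}, significant: {significant}")
```

Under these assumptions the same half-unit difference is not significant for the three smaller sample sizes and becomes significant only at the largest one, even though its practical importance has not changed.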

Significance and importance are different things. A sample outcome could be:

- significant and important
- significant but unimportant
- not significant but important
- not significant and unimportant

Next, we discuss hypothesis testing regarding the mean of a normally distributed population for which σ² is unknown and the sample size is small (n < 30). This procedure is illustrated through the following example:

SOLUTION

Hypothesis-Testing Procedure:

i) We state our null and alternative hypotheses as H0: µ = 66 and H1: µ ≠ 66.

ii) We set the significance level at α = 0.05.

iii) Test Statistic: The test statistic to be used is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

which, if H0 is true, has the t-distribution with n − 1 = 9 degrees of freedom.

Important Note: As indicated in the previous discussion, we always begin by assuming that H0 is true. (The entire mathematical logic of the hypothesis-testing procedure is based on the assumption that H0 is true.)

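iv) CALCULATIONS

| Individual No. | xi | xi² |
|---|---|---|
| 1 | 63 | 3969 |
| 2 | 63 | 3969 |
| 3 | 66 | 4356 |
| 4 | 67 | 4489 |
| 5 | 68 | 4624 |
| 6 | 69 | 4761 |
| 7 | 70 | 4900 |
| 8 | 70 | 4900 |
| 9 | 71 | 5041 |
| 10 | 71 | 5041 |
| Total | 678 | 46050 |

Now,

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{678}{10} = 67.8 \text{ cm}$$

And

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{1}{n - 1}\left[\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right] = \frac{1}{9}\left[46050 - 45968.4\right] = 9.0667$$

So,

$$s = \sqrt{9.0667} = 3.01 \text{ cm}$$

$$\therefore \; t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{67.8 - 66}{3.01 / \sqrt{10}} = 1.89$$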

v) Critical Region:

Since this is a two-tailed test, the critical region is given by | t | > t0.025(9) = 2.262.

[Figure: t-distribution centred at 0, with REJECT regions below −2.262 and above 2.262 and the ACCEPT region between them.]

vi) Conclusion: Since the computed value of t = 1.89 does not fall in the critical region, we do not reject H0. The sample is consistent with the hypothesis that the mean height of the animals of this particular species is 66 centimeters.
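The whole calculation can be reproduced from the ten observations in the table. This is a minimal sketch using Python's standard library; the critical value 2.262 is taken from the text and not computed, and the variable names are illustrative.

```python
import math
import statistics

# Heights (cm) of the ten individuals from the calculation table
heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
mu_0 = 66          # hypothesized population mean under H0
critical_t = 2.262 # t0.025(9), as quoted in the text

n = len(heights)
sample_mean = statistics.mean(heights)
sample_sd = statistics.stdev(heights)  # uses the n - 1 divisor

t = (sample_mean - mu_0) / (sample_sd / math.sqrt(n))
reject_h0 = abs(t) > critical_t

print(f"mean = {sample_mean:.1f}, s = {sample_sd:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

The computed mean, standard deviation, and t agree with the hand calculation to the precision shown there, and H0 is not rejected.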

t Test

- t statistic (obtained): The test statistic computed to test the null hypothesis about a population mean when the population standard deviation is unknown and is estimated using the sample standard deviation.
- t distribution: A family of curves, each determined by its degrees of freedom (df). It is used when the population standard deviation is unknown and the standard error is estimated from the sample standard deviation.
- Degrees of freedom (df): The number of scores that are free to vary in calculating a statistic.
