Making Decisions With Two or More Samples of Quantitative Data

HOSP 1207 (Business Stats) Learning Centre
This worksheet focuses on comparisons of two or more independent samples of
qualitative data. This means we’ll be working with proportions. Similar to the previous
worksheets, we are interested in finding out whether there is a significant difference
between the proportions of two populations.
Let’s say that a company wanted to do market research about wine club memberships.
A random sample of adult Vancouverites was surveyed. The survey results show that
out of 74 female respondents, 48 were members of a wine club. Out of 109 male
respondents, 66 were members of a wine club. Do the survey results provide evidence
that the men and women in Vancouver have differing tendencies to belong to wine clubs
at a 5% significance level?
The example above is a comparison of two independent samples of qualitative data.

There is no relationship between the members of the two samples and the samples are
different sizes.
The first step should be familiar: check that the two binomial sampling distributions can
be approximated with a normal distribution. That way, we know the sampling
distribution of the difference (p̂1 − p̂2) is also approximately normally distributed. Since
we don’t have any information about the true population proportions, p1 and p2, we have
to use the sample estimates to check our conditions:
n1 is less than 5% of population 1, n1 p̂1 ≥ 10, n1 q̂1 ≥ 10 and
n2 is less than 5% of population 2, n2 p̂2 ≥ 10, n2 q̂2 ≥ 10
where q̂1 = 1 − p̂1 and q̂2 = 1 − p̂2.
If these conditions are fulfilled, we can proceed.
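The condition check can be sketched in Python. This is a minimal illustration, not part of the original worksheet: the function name `normal_approx_ok` and its optional `population_size` argument are assumptions made for the example. It compares the counts directly, since n p̂ is the number of successes and n q̂ is the number of failures.

```python
def normal_approx_ok(x, n, population_size=None):
    """Check the sample-size conditions for one sample.

    x: number of successes, n: sample size.
    population_size is optional; pass it only if it is known.
    """
    successes = x        # equals n * p_hat
    failures = n - x     # equals n * q_hat
    checks = {
        "n * p_hat >= 10": successes >= 10,
        "n * q_hat >= 10": failures >= 10,
    }
    if population_size is not None:
        # The sample must be less than 5% of the population.
        checks["n < 5% of population"] = n < 0.05 * population_size
    return all(checks.values()), checks


# Wine club samples from the example above; no population size is supplied,
# so only the two count conditions are checked here.
ok_women, detail_women = normal_approx_ok(48, 74)
ok_men, detail_men = normal_approx_ok(66, 109)
print(ok_women, detail_women)
print(ok_men, detail_men)
```

Both samples must pass before the test proceeds.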
There are two possible cases for creating a null hypothesis. We are either checking
whether the proportions in both populations are the same (i.e., the difference between
the proportions is zero), or we are checking to see whether one proportion is a particular
fixed number higher than the other.
In the first case, the null hypothesis is H0: p1 − p2 = 0. This is a special case where we
pool the data from the two samples to get a single estimate of the population
proportion:
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \hat{q} = 1 - \hat{p}$$
x1 and x2 are the number of successes in samples 1 and 2, and n1 and n2 are the total
number of individuals in samples 1 and 2. We pool the data because we are assuming
that the population proportions are equal.
© 2013 Vancouver Community College Learning Centre. Authored by Emily Simpson.
Student review only. May not be reproduced for classes.
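The standard error of the sampling distribution of (p̂1 − p̂2) when the data is pooled,
writing q̂ = 1 − p̂, is:
$$SE = \sqrt{\hat{p}\,\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$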
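To calculate the z-score, we use:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{SE}$$
where SE is the standard error of (p̂1 − p̂2).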
Note for (p1 – p2), we always plug in the number from the null hypothesis.
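As a rough sketch of the pooled case in Python: the function name `pooled_z` and the sample counts below are made up for illustration and do not come from the worksheet. The null difference is zero, so zero is what gets plugged in for (p1 − p2).

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """Pooled estimate, standard error and z-score for H0: p1 - p2 = 0."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Single pooled estimate of the common population proportion.
    p_pool = (x1 + x2) / (n1 + n2)
    q_pool = 1 - p_pool
    se = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    # The hypothesized difference under H0 is 0.
    z = ((p1_hat - p2_hat) - 0) / se
    return p_pool, se, z


# Hypothetical counts: 45 successes out of 100 vs 30 successes out of 100.
p_pool, se, z = pooled_z(45, 100, 30, 100)
print(f"pooled p = {p_pool:.4f}, SE = {se:.4f}, z = {z:.2f}")
```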
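The second case is when the null hypothesis sets the difference p1 − p2 equal to a
non-zero number. In this case, we do not pool the data. The standard error is calculated
as below:
$$SE = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
where q̂1 = 1 − p̂1 and q̂2 = 1 − p̂2.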
The z-score calculation uses the same formula as above.
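The unpooled case can be sketched the same way. The function name `unpooled_z`, the parameter `null_diff` and the sample counts are hypothetical, chosen only to illustrate the calculation; the one change from the pooled case is that each sample keeps its own proportion in the standard error.

```python
from math import sqrt


def unpooled_z(x1, n1, x2, n2, null_diff):
    """Standard error and z-score for H0: p1 - p2 = null_diff (non-zero)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Each sample contributes its own variance term; nothing is pooled.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Plug the hypothesized difference from H0 into the numerator.
    z = ((p1_hat - p2_hat) - null_diff) / se
    return se, z


# Hypothetical counts: 60 of 100 vs 40 of 100, testing H0: p1 - p2 = 0.10.
se, z = unpooled_z(60, 100, 40, 100, null_diff=0.10)
print(f"SE = {se:.4f}, z = {z:.2f}")
```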
After calculating the z-score, compare it to z(α) for a one-tailed test or z(α/2) for a
two-tailed test to decide whether to reject or fail to reject the null hypothesis.
Alternatively, find the p-value of the z-score and compare it to α to make the decision.
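The decision step might look like the following in Python, using `statistics.NormalDist` from the standard library (Python 3.8+) in place of a z-table. The function name `decide`, the `tail` labels and the example z-score of 2.10 are assumptions made for this sketch.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def decide(z, alpha, tail="two-sided"):
    """Return (critical value, p-value, reject H0?) for a z-score."""
    if tail == "two-sided":
        critical = STD_NORMAL.inv_cdf(1 - alpha / 2)   # z(alpha/2)
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    elif tail == "right":
        critical = STD_NORMAL.inv_cdf(1 - alpha)       # z(alpha)
        p_value = 1 - STD_NORMAL.cdf(z)
    elif tail == "left":
        critical = STD_NORMAL.inv_cdf(alpha)           # -z(alpha)
        p_value = STD_NORMAL.cdf(z)
    else:
        raise ValueError("tail must be 'two-sided', 'right' or 'left'")
    # Comparing the p-value to alpha gives the same decision as
    # comparing z to the critical value.
    return critical, p_value, p_value < alpha


critical, p_value, reject = decide(2.10, alpha=0.05)
print(f"critical = {critical:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```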
Continuing with the wine club example:

The null hypothesis is H0: p1 – p2 = 0, so this is the special case where we pool the data.
The alternative hypothesis is H1: p1 – p2 ≠ 0.
First we check that the conditions for using the normal distribution apply.
Let the women be sample 1, and the men be sample 2.
n1 = 74, x1 = 48, p̂1 = 0.6486     n2 = 109, x2 = 66, p̂2 = 0.6055
Population of women in Vancouver must be > 74/0.05 = 1,480 ✓
Population of men in Vancouver must be > 109/0.05 = 2,180 ✓
n1 p̂1 = 48 ≥ 10 ✓
n1 q̂1 = 26 ≥ 10 ✓
n2 p̂2 = 66 ≥ 10 ✓
n2 q̂2 = 43 ≥ 10 ✓
The two binomial sampling distributions can be approximated with standard normal
distributions. Now we can calculate the single estimate of the population proportion, the
standard error and z-score.
$$\hat{p} = \frac{48 + 66}{74 + 109} = 0.6230$$
$$\hat{q} = 1 - \hat{p} = 1 - 0.6230 = 0.3770$$
The standard error:
$$SE = \sqrt{(0.6230)(0.3770)\left(\frac{1}{74} + \frac{1}{109}\right)} = 0.0730$$
The z-score:
$$z = \frac{(0.6486 - 0.6055) - 0}{0.0730} = 0.59$$
The one-tailed p-value for z = 0.59 is 0.2776; we multiply by 2 (for a two-tailed test) to
get 0.5552.
If we compare the z-score to z(α/2) = 1.96, our z-score is less than z(α/2), so we fail to
reject the null hypothesis.
Alternatively, compare the p-value of 0.5552 to α of 0.05: since our p-value > α, we fail to
reject the null hypothesis. We conclude that there is insufficient evidence that men and
women in Vancouver have differing tendencies to belong to wine clubs.
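The wine club calculation can be reproduced step by step in Python as a check. This is a sketch using only the standard library; the variable names are chosen for the example. Because the code does not round the z-score before looking up the p-value, its p-value can differ slightly in the third decimal from the table-based 0.5552.

```python
from math import sqrt
from statistics import NormalDist

# Sample 1: women, sample 2: men (counts from the survey).
x1, n1 = 48, 74
x2, n2 = 66, 109
alpha = 0.05

p1_hat = x1 / n1
p2_hat = x2 / n2

# H0: p1 - p2 = 0, so the data are pooled.
p_pool = (x1 + x2) / (n1 + n2)
q_pool = 1 - p_pool
se = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
z = ((p1_hat - p2_hat) - 0) / se

# Two-tailed test: double the upper-tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, pooled p = {p_pool:.4f}")
print(f"SE = {se:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```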
Exercises
1. An airline company is interested in reducing the percentage of customers with
lost baggage. It implements a new check-in/luggage handling procedure. Before
the changes, the airline took a random sample of 1000 customers and found that
30 of them reported lost bags. After the new procedures, a sample of 1600
customers showed that 31 customers reported lost bags. Is there evidence at the
4% significance level that the proportion of customers with lost baggage
decreased after the policy change? The airline handles luggage for thousands of
customers every day.
2. A 2010 survey showed that the percentage of Canadian post-secondary students
with a blog had increased to 48%, up from 32% in a study done in 2008. In 2008,
750 Canadian post-secondary students were surveyed. In 2010, 700 Canadian
post-secondary students were surveyed. Based on the surveys, can we conclude
that the percentage of Canadian post-secondary students with a blog increased
by more than 10 percentage points in the 2 years between the surveys? Use a
2.5% level of significance.
3. A quality control inspector wants to test whether the proportion of cans of tomato
paste meeting the required sugar content from two different manufacturers is
equal. A sample of 250 cans from Manufacturer A revealed 237 cans met the
required sugar content. From a sample of 234 cans from Manufacturer B, 209
met the required sugar content. At the 10% significance level, is there any
evidence of a difference in the proportion of cans that meet the required sugar
content from Manufacturers A and B?
Solutions

1. H0: p1 – p2 = 0 H1: p1 – p2 > 0
p1 is the population proportion before the changes, p2 is the population proportion
after the changes. Data will be pooled. Conditions for approximating binomial
distribution with standard normal distribution are met.
p̂ = 0.0235, q̂ = 1 − p̂ = 0.9765
std error = 0.006102
z-score = 1.74, p-value = 0.0409
z(α) = 1.751
Since the p-value is greater than α, we fail to reject the null hypothesis. There is
insufficient evidence to conclude that the changes implemented by the airline
reduced the proportion of customers with lost luggage.
2. H0: p1 – p2 = 0.10 H1: p1 – p2 > 0.10

p1 is the proportion of students with a blog in 2010 and p2 is the proportion in 2008.
Data will NOT be pooled. Conditions for approximating binomial distribution with
standard normal distribution are met.
p̂1 = 0.48, q̂1 = 0.52, n1 = 700, p̂2 = 0.32, q̂2 = 0.68, n2 = 750
std error = 0.02543
z-score = 2.36, p-value = 0.0091
z(α) = 1.96
Since the p-value is less than α, or alternatively since the z-score is greater than z(α),
we reject the null hypothesis. There is sufficient evidence to conclude that the
percentage of Canadian post-secondary students with a blog has increased by
more than 10 percentage points since the 2008 survey.
3. H0: p1 – p2 = 0 H1: p1 – p2 ≠ 0
p1 is Manufacturer A’s proportion and p2 is Manufacturer B’s proportion. Data will
be pooled. Conditions for approximating binomial distribution with standard normal
distribution are met.
p̂ = 0.9215, q̂ = 1 − p̂ = 0.0785
std error = 0.0245
z-score = 2.24, p-value = 0.0125 → multiply p-value by 2: 0.0250
z(α/2) = 1.645
Since the p-value is less than α, or alternately since z-score is greater than z(α/2),
we reject the null hypothesis. There is sufficient evidence to support that the
proportion of cans from Manufacturer A that meet the required sugar content is
different than the proportion of cans from Manufacturer B that meet the required
sugar content.
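The three solutions can be cross-checked with one reusable function. This is a sketch only: the name `two_proportion_z_test` and its parameters are assumptions for illustration, and the Exercise 2 counts are reconstructed from the stated percentages (0.48 × 700 = 336 and 0.32 × 750 = 240). Unrounded p-values may differ in the last digit from the table-based values in the solutions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, null_diff=0.0, tail="two-sided"):
    """Return (standard error, z-score, p-value) for H0: p1 - p2 = null_diff."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    if null_diff == 0:
        # Null difference of zero: pool the two samples.
        p_pool = (x1 + x2) / (n1 + n2)
        se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    else:
        # Non-zero null difference: do not pool.
        se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = ((p1_hat - p2_hat) - null_diff) / se
    cdf = NormalDist().cdf
    if tail == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - cdf(z)
    elif tail == "left":
        p_value = cdf(z)
    else:
        raise ValueError("tail must be 'two-sided', 'right' or 'left'")
    return se, z, p_value


# Exercise 1: before (sample 1) vs after (sample 2), right-tailed, alpha = 0.04.
ex1 = two_proportion_z_test(30, 1000, 31, 1600, tail="right")
# Exercise 2: 2010 (sample 1) vs 2008 (sample 2), H0 difference 0.10, alpha = 0.025.
ex2 = two_proportion_z_test(336, 700, 240, 750, null_diff=0.10, tail="right")
# Exercise 3: Manufacturer A vs B, two-tailed, alpha = 0.10.
ex3 = two_proportion_z_test(237, 250, 209, 234)

for name, (se, z, p), alpha in [("1", ex1, 0.04), ("2", ex2, 0.025), ("3", ex3, 0.10)]:
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"Exercise {name}: SE = {se:.5f}, z = {z:.2f}, p = {p:.4f} -> {verdict}")
```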

