Est Diff Mean

Estimation of Difference of Means
Business Statistics / Statistical Inference
Case of Two Independent Populations
The Concept
Consider a random sample mean $\bar{X}_1$ of some random sample 1, belonging to the sampling distribution of the mean of population 1. Its expected value is $\mu_1$, the mean of population 1, for which the sampling distribution of the mean is made. The variance of $\bar{X}_1$ is given by $s_1^2/N_1$, i.e. $SEM_1 = s_1/\sqrt{N_1}$, where $s_1$ is the standard deviation of sample 1 from population 1 and $N_1$ is the size of sample 1 from this sampling distribution of the mean.
The Concept
Consider another random sample mean $\bar{X}_2$ of some random sample 2, belonging to the sampling distribution of the mean of population 2. Its expected value is $\mu_2$, the mean of population 2, for which the sampling distribution of the mean is made. The variance of $\bar{X}_2$ is given by $s_2^2/N_2$, i.e. $SEM_2 = s_2/\sqrt{N_2}$, where $s_2$ is the standard deviation of sample 2 from population 2 and $N_2$ is the size of sample 2 from this sampling distribution of the mean.
The Concept
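• The two situations can be depicted as follows:
[Figure: Population Cloud 1 (mean $\mu_1$) yields Sample 1 with mean $\bar{x}_1$, which belongs to the sampling distribution of the mean $\bar{X}_1$. Population Cloud 2 (mean $\mu_2$) yields Sample 2 with mean $\bar{x}_2$, which belongs to the sampling distribution of the mean $\bar{X}_2$, with $variance_2 = s_2^2/N_2 \Rightarrow SEM_2 = s_2/\sqrt{N_2}$. The two sampling distributions are combined (±) into Population Cloud 3.]
$\mu_3 = \mu_1 \pm \mu_2$
+ is used when the two sampling distributions are added
- is used when the two sampling distributions are subtracted
Whether the sampling distributions are added or subtracted, the law of variances states that their variances are always summed (+).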
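The law of variances quoted above can be written out in one line. This is a sketch under the assumption that the two samples are independent, so that the covariance term is zero; $SEM_3$ is a label introduced here for the standard error of the combined estimate.

```latex
\begin{align*}
\operatorname{Var}(\bar{X}_1 \pm \bar{X}_2)
  &= \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2)
     \pm 2\,\operatorname{Cov}(\bar{X}_1, \bar{X}_2) \\
  &= \frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}
  \quad\Rightarrow\quad
  SEM_3 = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}
\end{align*}
```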
Estimating Population Mean
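• Remember that for one sample the estimation of $\mu$ is given as,
$\mu = \bar{x} \pm t_c \, \frac{s}{\sqrt{N}}$
Here the $t_c$ value is called the critical value for a CI or $\alpha$.
• Suppose we have means from sampling distribution 1 and sampling distribution 2, and they are $\bar{x}_1$ and $\bar{x}_2$ respectively.
• If the two sampling distributions are added to obtain $\mu_1 + \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 + \mu_2 = (\bar{x}_1 + \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
Here the $t_c$ value is called the critical value for a CI, or for $\alpha$ or $\alpha/2$.
• If the two sampling distributions are subtracted to obtain $\mu_1 - \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
Here the $t_c$ value is called the critical value for a CI, or for $\alpha$ or $\alpha/2$.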
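The two-sample estimate described above can be sketched in a few lines of Python. This is only an illustration: the function name `two_mean_interval`, its parameters, and the summary statistics in the usage lines are made up here, and the critical value is assumed to be read from a t table rather than computed.

```python
import math

def two_mean_interval(mean1, s1, n1, mean2, s2, n2, t_crit, subtract=True):
    """Interval estimate for mu1 - mu2 (or mu1 + mu2) from summary statistics.

    t_crit is the critical value read from a t table for the chosen CI.
    """
    # Point estimate: the sample means are subtracted or added.
    estimate = mean1 - mean2 if subtract else mean1 + mean2
    # The variances of the two sample means are summed in both cases.
    sem = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_crit * sem
    return estimate - margin, estimate + margin

# Made-up summary statistics, for illustration only.
low, high = two_mean_interval(50.0, 4.0, 16, 47.0, 3.0, 9, t_crit=2.0)
print(round(low, 3), round(high, 3))
```

Passing `subtract=False` moves the centre of the interval to the sum of the sample means while leaving its width unchanged, which is the point of the law of variances.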
General Format of Interval Estimation
• The general form of an interval estimate at some confidence interval CI:
Population estimate = estimated value ± critical value × std. dev. of the estimate
or
Population estimate = estimated value ± margin of error
For example:
Population estimate = sample mean ± critical value × standard error of mean
Standard error of mean (SEM) is the standard deviation of the sampling distribution of the mean.
General Format
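[Figure: t distribution centred at $t = 0$. The central area is the Confidence Interval (CI); each tail has area $\alpha/2$ and is cut off at the critical values $-t_c = -t_{\alpha/2}$ and $+t_c = +t_{\alpha/2}$.]
$\alpha$ is also called the level of significance and it is always equal to 1 - CI.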

Example
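• Let CI = 95% = 0.95, so that $\alpha = 1 - 0.95 = 0.05$ and $\alpha/2 = 0.025$.
[Figure: the same t distribution with central area CI = 0.95, tail areas $\alpha/2 = 0.025$ each, and critical values $-t_c = -t_{0.025}$ and $+t_c = +t_{0.025}$.]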
Two cases and DF
• In the case of independent populations, with reference
to the SEM, we can have two cases
– Equal variances
– Unequal variances
Case of Equal variances
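• If the sample variances are equal, then the SEM can be simplified from
$SEM = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
and $s_1 = s_2 = s$, to,
$SEM = s \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$
The degree of freedom or DF in this case is calculated as follows,
$DF = N_1 + N_2 - 2$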
Case of Equal Variances
• However, in practice two sample variances are never exactly the same;
rather, they can be approximately equal.
• An accepted rule is that equal variances can be assumed if the ratio of sample standard deviations satisfies:
$0.5 \le \frac{s_1}{s_2} \le 2$
• Furthermore, in this case we have to pool or group the variances as one single value. This pooled standard deviation, or $s_p$ (as it is called), is given by,
$s_p = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}}$
and hence the estimation formula becomes,

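• If the two sampling distributions are added to obtain $\mu_1 + \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 + \mu_2 = (\bar{x}_1 + \bar{x}_2) \pm t_c \, s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$
Here the $t_c$ value is called the critical value for a CI or $\alpha$, with DF = $N_1 + N_2 - 2$
• If the two sampling distributions are subtracted to obtain $\mu_1 - \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \, s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$
Here the $t_c$ value is called the critical value for a CI or $\alpha$, with DF = $N_1 + N_2 - 2$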
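The equal-variances procedure can be sketched as three small Python functions. The names `can_pool`, `pooled_sd` and `pooled_interval` are hypothetical, the bounds 0.5 and 2.0 in `can_pool` are an assumed, commonly quoted rule of thumb for the ratio of standard deviations, and the critical value is again assumed to come from a t table with DF = N1 + N2 - 2.

```python
import math

def can_pool(s1, s2, low=0.5, high=2.0):
    """Rule-of-thumb check on the ratio of sample standard deviations.

    The default bounds are an assumption, not taken from the slides.
    """
    return low <= s1 / s2 <= high

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation s_p of two samples."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Interval for mu1 - mu2; t_crit is read at DF = n1 + n2 - 2."""
    sp = pooled_sd(s1, n1, s2, n2)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Made-up summary statistics, for illustration only.
print(can_pool(3.0, 4.0))
print(pooled_sd(3.0, 5, 4.0, 5))
print(pooled_interval(10.0, 2.0, 8, 8.0, 2.0, 8, t_crit=2.0))
```

When the two sample sizes are equal, the pooled variance is simply the average of the two sample variances, which is a quick sanity check on `pooled_sd`.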
Case of Unequal Variances and DF
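• When the condition of equal variances is not satisfied, there is no change in the estimation formula; only the degree of freedom or DF is calculated differently.
• This new formula of DF is given as follows:
$DF = \frac{\left(\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}\right)^2}{\frac{1}{N_1 - 1}\left(\frac{s_1^2}{N_1}\right)^2 + \frac{1}{N_2 - 1}\left(\frac{s_2^2}{N_2}\right)^2}$
• If the two sampling distributions are added to obtain $\mu_1 + \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 + \mu_2 = (\bar{x}_1 + \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
Here the $t_c$ value is called the critical value for a CI or $\alpha$, with DF as above
• If the two sampling distributions are subtracted to obtain $\mu_1 - \mu_2$, then the estimation is given using two sample means from two separate populations as follows,
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
Here the $t_c$ value is called the critical value for a CI or $\alpha$, with DF as above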
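The DF for the unequal-variances case is tedious by hand, so a short Python sketch may help. The function name `welch_df` is hypothetical, and the inputs in the usage line are made up. Rounding the result down to a whole number before using a t table is a common conservative convention and is an assumption here, not something stated in the slides.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for two samples with unequal variances."""
    v1 = s1**2 / n1  # variance of sample mean 1
    v2 = s2**2 / n2  # variance of sample mean 2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Made-up standard deviations and sample sizes, for illustration only.
df = welch_df(2.0, 10, 5.0, 15)
print(df, math.floor(df))  # floor(df) is the assumed table lookup value
```

With equal standard deviations and equal sample sizes the formula reduces to N1 + N2 - 2, the DF of the equal-variances case.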
Case of One Same Population
The Concept
• Consider the case where we take two samples from the same sampling distribution of the mean of a single population.
• The differences of the sample values make a new difference dataset that estimates a new difference population as follows:
$\mu_d = \bar{x}_d \pm t_c \, \frac{s_d}{\sqrt{N_d}}$
Here
$\bar{x}_d$ is the mean of a new dataset which contains the differences of measured values from two samples taken from the same single population.
$s_d$ is the standard deviation of the new differences dataset.
$N_d$ is the sample size of the new differences dataset.
$t_c$ is the critical value at some CI or level of significance $\alpha$ with $DF = N_d - 1$
[Figure: Sample 1 and Sample 2 are drawn from one Population Cloud. Find the point-to-point difference of the two samples; this forms a new difference dataset with mean $\bar{x}_d$. The means of all such difference datasets form the Sampling Distribution of Difference of Means $\bar{X}_d$, centred at $\mu_d$.]
Summary of Formulae for Estimation of
Difference of Means
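S. No   Formula to Use   Condition
1   $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \, s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$, with $DF = N_1 + N_2 - 2$   When we have two samples from two separate populations and the two samples have equal variances.
2   $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$, with DF from the unequal-variances formula   When we have two samples from two separate populations and the two samples have unequal variances.
3   $\mu_d = \bar{x}_d \pm t_c \, \frac{s_d}{\sqrt{N_d}}$, with $DF = N_d - 1$   When we have two samples from one single and same population.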
Comments
• The first two cases are similar in that they involve two separate populations and two sample means. The two sample means, their sample variances and sample standard deviations are calculated separately. They are then used in the estimation formula.
– This is also called the independent means or between groups case
• The last case is different from the first two. Firstly, the population is the same, and secondly, the sample values are subtracted pair by pair, which gives a new differences dataset. After obtaining this new differences dataset, its sample mean, variance and standard deviation are used in the estimation formula.
– This is also called the dependent means, paired groups or repeated measures case
Examples
Estimation of Difference of Means
Example 1
• In a packing plant, a machine packs cartons with jars. It is supposed that a new machine will
pack faster on the average than the machine currently used. An experiment was conducted
to record the time taken to pack 10 cartons by new and present/ old machine. The results in
seconds, are shown in the following table. Give 99% CI for the difference between the mean
time it takes the new machine to pack 10 cartons and the mean time it takes the old/ present
machine to pack 10 cartons.
Sample 1(New Machine) Sample 2 (Old machine)
42.1 42.7
41.0 43.6
41.3 43.8
41.8 43.3
42.4 42.5
42.8 43.5
43.2 43.1
42.3 41.7
41.8 44.0
42.7 44.1
From the table, $\bar{x}_1 = 42.14$ and $s_1 = 0.683$; $\bar{x}_2 = 43.23$ and $s_2 = 0.750$
Example 1
• Here two samples from two separate populations are taken
– Therefore it is independent means case
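• Check of variances is made by using the ratio of sample standard deviations as follows:
$\frac{s_1}{s_2} = \frac{0.683}{0.750} = 0.91$, which is close to 1
• This is the case of equal variances.
• Hence the difference between the mean time it takes the new machine to pack 10 cartons and the mean time it takes the old/present machine to pack 10 cartons can be estimated by using,
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \, s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$
and first finding $s_p$ as,
$s_p = \sqrt{\frac{(10 - 1)(0.683)^2 + (10 - 1)(0.750)^2}{10 + 10 - 2}} = 0.717$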
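Example 1
and secondly by finding the $t_c$ value as follows.
For the $t_c$ value we need to know two things:
1) CI => 99% (given) at two tails, i.e. $\alpha/2 = 0.005$
2) DF, which is $N_1 + N_2 - 2 = 10 + 10 - 2 = 18$
Using the t table we get $t_c = 2.878$
Finally,
$\mu_1 - \mu_2 = (42.14 - 43.23) \pm 2.878 \times 0.717 \times \sqrt{\frac{1}{10} + \frac{1}{10}} = -1.09 \pm 0.92$
The 99% CI is $(-2.01, -0.17)$
i.e. we are 99% confident that $\mu_1 - \mu_2$ is between $-2.01$ and $-0.17$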
Example 1
• In the context of the question, it means that the estimated mean difference of packing time between the two machines at 99% CI lies in the interval $(-2.01, -0.17)$ (in seconds).
The negative sign shows that the new machine is faster than the present/old machine throughout this interval.
The new machine packs 10 cartons 0.17 s to 2.01 s faster than the old/present machine.
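Example 1 can be checked numerically with the Python standard library. This is a sketch, not part of the original solution: the variable names are made up, and the critical value 2.878 is typed in from a t table (99% CI, two tails, DF = 18) rather than computed.

```python
import math
import statistics

# Packing times in seconds, copied from the table in Example 1.
new = [42.1, 41.0, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7]
old = [42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44.0, 44.1]

n1, n2 = len(new), len(old)
mean1, mean2 = statistics.mean(new), statistics.mean(old)
s1, s2 = statistics.stdev(new), statistics.stdev(old)  # sample SDs (N - 1)

# Pooled standard deviation for the equal-variances case.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

t_crit = 2.878  # from a t table: 99% CI, two tails, DF = 18
margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
diff = mean1 - mean2
low, high = diff - margin, diff + margin

print(f"s1/s2 = {s1 / s2:.2f}, sp = {sp:.3f}")
print(f"99% CI: ({low:.2f}, {high:.2f})")
```

The printed interval rounds to (-2.01, -0.17), which agrees with the 2.01 s to 0.17 s range quoted above.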
Example 2
• Independent random samples of 17 sophomores and 13 juniors attending a
large university yield the following data on grade point averages. At 5% level
of significance find the estimation of the difference of two population means
corresponding to sophomores and juniors respectively. Take the case of
unequal variances.
Example 2
• Here two samples from two separate populations are taken
– Therefore it is independent means case
• The sample statistics for two samples from two populations sophomores and juniors written
with subscripts 1 and 2 respectively are as follows:
• It is provided in the question that this is the case of unequal variances.
• Hence the difference between the mean GPAs is estimated as follows:
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_c \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
Example 2
Which gives
For the $t_c$ value we need to know two things:
1) CI => 95% (given) at two tails, i.e. $\alpha/2 = 0.025$
2) DF, which we need to calculate
Example 2
• DF is calculated as follows:
$DF = \frac{\left(\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}\right)^2}{\frac{1}{N_1 - 1}\left(\frac{s_1^2}{N_1}\right)^2 + \frac{1}{N_2 - 1}\left(\frac{s_2^2}{N_2}\right)^2}$
• After obtaining DF we can find required value from the table as follows:
Example 2
• Hence the $t_c$ value is 2.056.
• Therefore the calculation yields the 95% CI $(-0.4437, 0.1637)$ for $\mu_1 - \mu_2$.
At 95% CI, the mean GPA of sophomores is between 0.4437 lower and 0.1637 higher than that of juniors.
Example 3
• Trace metals in drinking water affect the flavor and an unusually
high concentration can pose a health hazard. Ten pairs of data
were taken measuring zinc concentration in bottom water and
surface water. Provide a 95% CI estimate of the mean
difference in zinc concentration between bottom water
and surface water. The data obtained are as follows:
Example 3
• Here two samples are taken from same
population.
• We first find new difference dataset as follows.

Example 3
• For reference the new difference dataset and its statistics.
Sample 1   Sample 2   Differences Dataset
0.430 0.415 0.015
0.266 0.238 0.028
0.567 0.390 0.177
0.531 0.410 0.121
0.707 0.605 0.102
0.716 0.609 0.107
0.651 0.632 0.019
0.589 0.523 0.066
0.469 0.411 0.058
0.723 0.612 0.111
Example 3
• The mean of the new difference dataset is $\bar{x}_d = 0.0804$
• The standard deviation of the new difference dataset is $s_d = 0.0523$
• The DF is $N_d - 1 = 10 - 1 = 9$
• At 95% CI at two tails and DF = 9, the t table gives $t_c = 2.262$
Example 3
• Hence the interval estimation of the difference of means, $\mu_d$, is obtained as follows,
$\mu_d = 0.0804 \pm 2.262 \times \frac{0.0523}{\sqrt{10}} = 0.0804 \pm 0.0374$
At 95% CI, the mean zinc concentration is 0.043 to 0.1178 higher in bottom water than in surface water.
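The paired calculation in Example 3 can be reproduced with a short Python sketch. The variable names are made up, the two lists are the Sample 1 and Sample 2 columns of the table above, and the critical value 2.262 is typed in from a t table (95% CI, two tails, DF = 9).

```python
import math
import statistics

# Zinc concentrations copied from the Example 3 table.
sample1 = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723]
sample2 = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612]

# Point-to-point differences form the new difference dataset.
diffs = [a - b for a, b in zip(sample1, sample2)]
n_d = len(diffs)
mean_d = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample SD (N - 1)

t_crit = 2.262  # from a t table: 95% CI, two tails, DF = n_d - 1 = 9
margin = t_crit * s_d / math.sqrt(n_d)
low, high = mean_d - margin, mean_d + margin

print(f"mean_d = {mean_d:.4f}, s_d = {s_d:.4f}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

To three decimal places the interval is (0.043, 0.118), and since it lies entirely above zero, Sample 1 is higher on average than Sample 2 at this confidence level.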
