
TOPIC 2: ESTIMATION

Miss Nur 'Ainina binti Awang


Faculty of Computer and Mathematical
Sciences
UiTM Shah Alam

013-3547250
ainina@tmsk.uitm.edu.my
Contents

Sampling Distribution

• Introduction
• Sampling Distribution of the Sample Mean, $\bar{x}$
• Central Limit Theorem (CLT)
• Probability Distribution of the Sample Mean, $\bar{x}$

Estimation
• Introduction
• Point Estimation
• Interval Estimation (Confidence Interval)
• One Population Mean
• Two Population Means
• One Population Variance
• Ratio of Two Population Variances
Sampling Distribution: Introduction
• The following are some required definitions:
Term: Definition
Population: All elements under study, whether living or non-living objects
Sample: A subset or part of the population
Population parameter: A summary measure (characteristic) obtained from the population
Sample statistic: A summary measure (characteristic) obtained from a sample
Population distribution: The probability distribution of the population data
Sampling distribution: The probability distribution of a sample statistic

• In statistical inference, we are interested in making an inference (a generalization or conclusion) about a
population based on sample information.
Sampling Distribution: Introduction
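• Examples of numerical descriptive measures are the mean, standard deviation and variance. The formulas are given in the following
table:

Mean
  Population (parameter): $\mu = \frac{\sum X}{N}$
  Sample (statistic): $\bar{x} = \frac{\sum x}{n}$

Standard deviation
  Population (parameter): $\sigma = \sqrt{\frac{1}{N}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]}$
  Sample (statistic): $s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$

Variance
  Population (parameter): $\sigma^2 = \frac{1}{N}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]$
  Sample (statistic): $s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$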

• Example 2.1: A new battery of Model XY has a life span with a mean of 40 months and a standard deviation of 4 months. A
sample of 50 batteries showed an average life span of 39 months and a standard deviation of 4.2 months.
• Therefore, μ = 40, σ = 4, $\bar{x}$ = 39, s = 4.2
Sampling Distribution: Sampling Distribution of the Sample Mean, $\bar{x}$
• A sampling distribution of sample means is a distribution obtained by using the means computed from random samples of a
specific size taken from a population.

• Example 2.2: A lecturer gives a 10-point quiz to a population of 4 students. The results of the quiz were 2, 4, 6 and 8.

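Summary Measures for Population Distribution

Population probability distribution:
x    P(x)
2    1/4
4    1/4
6    1/4
8    1/4

$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5$

$\sigma = \sqrt{\frac{1}{N}\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]} = \sqrt{\frac{1}{4}\left[120 - \frac{20^2}{4}\right]} = 2.2361$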

• Now, take all possible samples of size 2 with replacement and find the mean of each sample.

Samples (first draw in rows, second draw in columns):
      2     4     6     8
2    2,2   2,4   2,6   2,8
4    4,2   4,4   4,6   4,8
6    6,2   6,4   6,6   6,8
8    8,2   8,4   8,6   8,8

Mean of each sample:
      2    4    6    8
2     2    3    4    5
4     3    4    5    6
6     4    5    6    7
8     5    6    7    8

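Summary Measures for Sample Mean Distribution

Sampling probability distribution:
$\bar{x}$    P($\bar{x}$)
2    1/16
3    2/16
4    3/16
5    4/16
6    3/16
7    2/16
8    1/16

$\mu_{\bar{x}} = \frac{\sum \bar{x}}{16} = \frac{2 + 3 + 4 + \cdots + 8}{16} = \frac{80}{16} = 5$

$\sigma_{\bar{x}} = \sqrt{\frac{1}{16}\left[\sum \bar{x}^2 - \frac{(\sum \bar{x})^2}{16}\right]} = \sqrt{\frac{1}{16}\left[440 - \frac{80^2}{16}\right]} = 1.5811$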
Comparison of the two distributions in Example 2.2:
Population distribution: N = 4, μ = 5, σ = 2.2361
Sampling distribution of $\bar{x}$: n = 2, $\mu_{\bar{x}}$ = 5, $\sigma_{\bar{x}}$ = 1.5811
[Figure: bar charts of P(X) against X for the population distribution (X = 2, 4, 6, 8) and for the sampling distribution of $\bar{x}$ ($\bar{x}$ = 2, 3, ..., 8)]
Conclusion:
• The mean of the sampling distribution of $\bar{x}$, denoted by $\mu_{\bar{x}}$, is the same as the population mean μ:
$\mu_{\bar{x}} = \mu$
• The standard deviation of the sampling distribution of $\bar{x}$, denoted by $\sigma_{\bar{x}}$, is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
where σ is the standard deviation of the population and n is the sample size.
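The conclusion can be checked by brute force for Example 2.2. The following is a minimal sketch that enumerates every ordered sample of size 2 drawn with replacement; the variable names are illustrative only.

```python
from itertools import product
from math import sqrt

population = [2, 4, 6, 8]   # quiz results from Example 2.2
n = 2                       # sample size

# Population mean and standard deviation (divide by N, not N - 1)
N = len(population)
mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)

# Means of all N**n ordered samples drawn with replacement
sample_means = [sum(s) / n for s in product(population, repeat=n)]

# Mean and standard deviation of the sampling distribution of the mean
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = sqrt(
    sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)
)

print("population:", mu, round(sigma, 4))
print("sample means:", mu_xbar, round(sigma_xbar, 4))
print("sigma / sqrt(n):", round(sigma / sqrt(n), 4))
```

The last two printed values agree, which illustrates that the standard deviation of the sample means equals σ divided by the square root of n when sampling with replacement.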
Sampling Distribution: Central Limit Theorem (CLT)
• The CLT states that the sampling distribution of the sample mean will be normal or approximately normal if the sample size is large.

• When using the Central Limit Theorem, consider two cases:

• If the original population is normally distributed or approximately normal, then the distribution of the sample mean
will be (approximately) normally distributed for any sample size n.

• If the original population is not normally distributed, the distribution of the sample mean will be approximately normally
distributed for a sample size of 30 or more.

[Figure: normal and non-normal population distributions together with their respective sampling distributions of $\bar{x}$ for different sample sizes n]

Rule of thumb:
n ≥ 30
(considered large)
Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$

The Central Limit Theorem (CLT) on the Distribution of the Sample Mean, $\bar{x}$

• The properties of the distribution of sample means:

The mean of the sample means will be the same as the population mean, $\mu_{\bar{x}} = \mu$.

The standard deviation (standard error) of the sample means will be smaller than the
standard deviation of the population (for n > 1), and it will be equal to the population standard
deviation divided by the square root of the sample size, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

• Note: If the population is not normally distributed or there is no information regarding the population, then the distribution
of the sample means tends to be normally distributed when the sample size is sufficiently large, that is, when n ≥ 30.
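Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$

Sampling distribution of the sample mean, $\bar{X}$ (CLT)

Notation: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

Formula: $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
where $\bar{X}$ = sample mean, μ = population mean, σ = population standard deviation, n = sample size

Mean: $\mu_{\bar{X}} = \mu$
Variance: $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$
Standard deviation (standard error of the mean): $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$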
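Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$

Without sample (a single observation X) versus with sample (the mean $\bar{X}$ of a sample of size n):

Notation
  Without sample: $X \sim N(\mu, \sigma^2)$
  With sample: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

Formula
  Without sample: $z = \frac{x - \mu}{\sigma}$
  With sample: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Mean (expected value)
  Without sample: $\mu_X = \mu$
  With sample: $\mu_{\bar{X}} = \mu$

Variance
  Without sample: $\sigma^2_X = \sigma^2$
  With sample: $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$

Standard deviation
  Without sample: $\sigma_X = \sigma$
  With sample (standard error): $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$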
Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$
Example 2.3: A production firm manufactures light bulbs whose length of life is approximately normally distributed
with a mean of 800 hours and a standard deviation of 40 hours.
a) Write the probability distribution of the life of the light bulbs.
b) Write the sampling distribution of the mean life of the light bulbs.
c) Find the probability that the life of a light bulb is more than 850 hours. (Ans: 0.1056)
d) Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours. (Ans: 0.00621)
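Parts (c) and (d) of Example 2.3 can be checked numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module instead of a z table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 800, 40          # population mean and standard deviation (hours)

# (c) Single bulb: X ~ N(800, 40^2), P(X > 850)
life = NormalDist(mu, sigma)
p_c = 1 - life.cdf(850)

# (d) Mean of n = 16 bulbs: X-bar ~ N(800, 40^2 / 16), P(X-bar < 775)
n = 16
mean_life = NormalDist(mu, sigma / sqrt(n))
p_d = mean_life.cdf(775)

print(round(p_c, 4), round(p_d, 5))
```

The only difference between the two parts is the standard deviation: σ for a single observation, and σ/√n for the sample mean.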
Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$
Example 2.4: A manager observes that his income per day averages RM1000 with a standard deviation of RM200. He selects
a random sample of 30 days.
a) Describe the distribution of the sample mean.
b) What is the probability that the mean income for the sample of 30 days exceeds RM1050? (Ans: 0.0853)
c) What is the probability that the sample mean will be within RM100 of the population mean? (Ans: 0.99386)
Sampling Distribution: Probability Distribution of the Sample Mean, $\bar{x}$
Example 2.5: The actual weight, W kilograms, of fertilizer in a 5 kg bag may be modelled by a normal random variable with
mean 5.25 kg and variance 0.25 kg². A random sample of four 5 kg bags is selected. Calculate the probability that the mean
weight of fertilizer of the four bags is less than 5.30 kg. (Ans: 0.5793)
Estimation: Introduction
The following are some required definitions:
Term: Definition
Estimation: A procedure by which a numerical value is assigned to a population parameter based on
the information collected from a sample.
Estimate: The value(s) assigned to a population parameter based on the value of a sample statistic.
Estimator: The sample statistic used to estimate a population parameter.
If we have two or more estimators for the same parameter, we need to compare them and choose the best estimator. There
are three properties of a good estimator, namely unbiasedness, efficiency and consistency.
i. The estimator should be unbiased. That is, the expected value (the mean) of the estimates obtained
from samples of a given size is equal to the parameter being estimated: $E(\hat{\theta}) = \theta$

ii. The estimator should be relatively efficient. That is, of all the statistics that can be used to estimate a
parameter, the relatively efficient estimator has the smallest variance. $\hat{\theta}_1$ is a more efficient estimator of θ than $\hat{\theta}_2$ if
$\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$

iii. The estimator should be consistent. For a consistent estimator, as the sample size increases, the value of the estimator
approaches the value of the parameter being estimated. For an unbiased estimator, a sufficient condition is
$\lim_{n \to \infty} \mathrm{Var}(\hat{\theta}) = 0$
Estimation: Introduction
• Estimation refers to the process by which one makes inferences about a population based on information obtained
from a sample.

• There are two types of estimation:

Point Estimation (single value)
• A point estimate is a single value calculated from the sample data and used to estimate the population parameter.

Interval Estimation (range of values)
• An interval estimate is an interval or range of values used to estimate the parameter. This interval may or may not
contain the value of the parameter being estimated.
• Also known as a confidence interval.

For example, suppose we want to estimate the mean income of workers in Company A from a sample of n = 25 workers:
• Point estimate: the mean income is $\bar{x}$ = RM2500/month.
• Interval estimate: the mean income is between RM2300 and RM2700/month.
Estimation: Point Estimation
• Point estimator: A single number calculated from the sample to estimate the population parameter.
• To generalize the estimate to the population, the sample must be a random sample. A random sample is a sample
in which each element in the population has an equal chance of being included.
• The following table indicates the best point estimator for each parameter:

Population (Parameter): Sample (Statistic / Point Estimator)

μ: $\bar{x} = \frac{\sum x}{n}$

σ: $s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$

σ²: $s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$

p: $\hat{p} = \frac{x}{n}$
Estimation: Point Estimation
Example 2.6: A sample of 8 career women is selected and the total time each spends on exercise in a week is recorded. The resulting observations are 10.2
9.3 11.9 9.2 8.3 11.2 10.4 9.5. What are the point estimates of the mean and standard deviation of exercise time?
(Ans: 10, 1.1662)
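The point estimates in Example 2.6 can be reproduced with Python's standard `statistics` module. This is a minimal sketch; `stdev` uses the n − 1 divisor, which matches the sample standard deviation s defined above.

```python
from statistics import mean, stdev

# Weekly exercise times of the 8 women in Example 2.6
times = [10.2, 9.3, 11.9, 9.2, 8.3, 11.2, 10.4, 9.5]

x_bar = mean(times)   # point estimate of the population mean
s = stdev(times)      # point estimate of the population standard deviation

print(round(x_bar, 4), round(s, 4))
```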
Estimation: Interval Estimation
• Interval estimation: two numbers calculated from the sample to form an interval within which the parameter is expected to
lie with a specified level of confidence.
• The interval is constructed around the point estimate. This interval estimate is also known as a confidence interval.
• We can write the confidence interval for a parameter θ as:

P(a < θ < b) = 1 − α

Notation: a = the lower limit of the interval
b = the upper limit of the interval
1 − α = the confidence coefficient
(1 − α)100% = the confidence level
• The confidence level is the proportion of intervals, constructed in this way from repeated samples, that contain the parameter
being estimated. If you construct a 95% confidence interval, the confidence coefficient is 0.95 and the confidence level is 95%.
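Confidence Interval (CI)

Population Mean

General form: point estimate ± margin of error

One Population Mean
Typical task: construct a 95% confidence interval for the population mean.
• σ² known: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
• σ² unknown: $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

Two Population Means
Typical task: construct a 95% confidence interval for the difference of the population means.

Independent Samples
• σ² known: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
• σ² unknown, $\sigma_1^2 = \sigma_2^2$ (equal variances):
  $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
  $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$, $df = n_1 + n_2 - 2$
• σ² unknown, $\sigma_1^2 \neq \sigma_2^2$ (unequal variances):
  $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

Dependent Samples
  $\bar{d} \pm t_{\alpha/2,\,n-1} \frac{s_d}{\sqrt{n}}$
  $\bar{d} = \frac{\sum d}{n}$, $s_d = \sqrt{\frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]}$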
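Confidence Interval (CI)

Population Variance

One Population
• Variance, σ²:
  $\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
• Standard deviation, σ:
  $\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}$

Ratio of Two Populations
• Ratio of variances, $\frac{\sigma_1^2}{\sigma_2^2}$:
  $\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,v_1,\,v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,v_2,\,v_1}$
• Ratio of standard deviations, $\frac{\sigma_1}{\sigma_2}$:
  $\sqrt{\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,v_1,\,v_2}}} < \frac{\sigma_1}{\sigma_2} < \sqrt{\frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,v_2,\,v_1}}$

Each interval may equivalently be written as the pair (lower limit, upper limit).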
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Known)

One Population Mean and Known σ²

Assumption

• Random sample
• The population is normally distributed
• Small (n < 30) or large (n ≥ 30) sample
• σ² (or σ) is known

Formula

The (1 − α)100% confidence interval for the population mean μ is

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Confidence interval for the mean for a specific α when σ² (or σ) is known:

$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Known)
Example 2.7: The average lifetime of a product from a sample of 30 items is found to be 48 months. It is known that the
standard deviation of the population is 3 months and the population of such lifetime is normal.
a) What is the point estimate of the mean lifetime of all such products?
b) Construct the 95% confidence interval for the average lifetime of the product and interpret the interval. (Ans: (46.9265,
49.0735))
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Known)
Example 2.8: Now, repeat the same problem by finding the 90% confidence interval for the average lifetime.
(Ans: (47.0991, 48.9009))

Note: The width of the confidence interval depends on the size of the margin of error, which depends on the values of z, σ and n.
However, the value of σ is beyond our control. Therefore, the width of the confidence interval can be controlled either through the
value of z (which depends on α) or through the sample size n.
Confidence level and the width of the confidence interval
- The higher the confidence level, the wider the confidence interval, and vice versa.
Sample size and the width of the confidence interval
- The larger the sample size, the narrower the confidence interval, and vice versa.
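The intervals in Examples 2.7 and 2.8, and the effect of the confidence level on the width, can be checked numerically. This is a minimal sketch; `z_interval` is a hypothetical helper written for this illustration, and the critical value comes from `NormalDist.inv_cdf` rather than a z table, so the last decimal may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)              # margin of error
    return x_bar - margin, x_bar + margin


# Examples 2.7 and 2.8: x-bar = 48, sigma = 3, n = 30
ci_95 = z_interval(48, 3, 30, 0.95)
ci_90 = z_interval(48, 3, 30, 0.90)

print([round(v, 4) for v in ci_95])
print([round(v, 4) for v in ci_90])
```

The 90% interval is narrower than the 95% interval, as the note above states.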
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Known)
Example 2.9: The following data represent a sample of assets (in millions of RM) of 10 companies in Selangor. Find the 95%
confidence interval of the mean. Assume that the assets (in millions of RM) of all companies in Selangor are approximately
normally distributed and that the standard deviation of the population is 21.154.
12.23 2.89 13.19 73.25 11.59 8.74 7.92 40.22 5.01 2.27

Below is the output for the analysis of data in Example 2.9 using Minitab software.
One-Sample Z: Assets_Value

The assumed standard deviation = 21.154

Variable N Mean StDev SE Mean 95% CI


Assets_Value 10 17.73 22.30 6.69 (4.62, 30.84)
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Unknown)

One Population Mean and Unknown σ²

Assumption

• Random sample
• The population is normally distributed
• Small (n < 30) or large (n ≥ 30) sample
• σ² (or σ) is unknown

Formula

The (1 − α)100% confidence interval for the population mean μ is

$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

Confidence interval for the mean for a specific α when σ² (or σ) is unknown:

$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Unknown)
Example 2.10: The time taken (in seconds) to connect to the internet via a dial-in service for a sample of 35 nights gave a
mean of 26.46 and a standard deviation of 10.81. Find a 98% confidence interval on the mean time required to access the
internet during the night and interpret the interval. (Ans: (21.9705, 30.9495))
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Unknown)
Example 2.11: An explorer finds a new species of beetle. He captures and weighs 41 of them. The mass, m grams, of each
beetle in this sample is recorded, giving $\sum m = 84.05$ and a standard deviation of 1.412 grams. Find the 99% confidence
interval for the mean mass of all the beetles and interpret the interval. (Ans: (1.4538, 2.6462))
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Unknown)
Example 2.12: A pharmaceutical manufacturer purchases a particular material from a supplier. The manufacturer selects
nine shipments from the supplier and measures the percentage of impurities in the raw material from each shipment. The
sample mean and variance are 1.89 and 0.273 respectively. Find a 90% confidence interval for the population mean μ and interpret your results.
(Ans: (1.5661, 2.2139))
Estimation: Interval Estimation: One Population Mean, μ (Variance σ2 Unknown)
Example 2.13: The following data represent a sample of assets (in millions of RM) of 10 companies in Selangor.
12.23 2.89 13.19 73.25 11.59 8.74 7.92 40.22 5.01 2.27
Assume that the assets (in millions of RM) of all companies in Selangor are approximately normally distributed. The output
of the analysis of the data using Minitab software (where σ is unknown) is shown below.
One-Sample T: Assets_Value

Variable N Mean StDev SE Mean 95% CI


Assets_Value 10 17.73 22.30 7.05 (1.78, 33.68)

a) Show that the mean assets (in millions of RM) is 17.73.
b) Show that the 95% confidence interval of the true mean is the same as in the computer output.
c) Based on the confidence interval, can we conclude that the mean assets (in millions of RM) of all companies in Selangor
differ?
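Parts (a) and (b) of Example 2.13 can be checked against the Minitab output with a short calculation. This is a minimal sketch; the critical value t = 2.262 for α/2 = 0.025 and 9 degrees of freedom is taken from a t table and hard-coded, which is an assumption of this sketch rather than something computed by the code.

```python
from math import sqrt
from statistics import mean, stdev

# Assets (in millions of RM) of the 10 companies in Example 2.13
assets = [12.23, 2.89, 13.19, 73.25, 11.59, 8.74, 7.92, 40.22, 5.01, 2.27]

n = len(assets)
x_bar = mean(assets)   # sample mean
s = stdev(assets)      # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)       # standard error of the mean

# Assumption: t_{0.025, 9} = 2.262, read from a t table
t_crit = 2.262
lower = x_bar - t_crit * se
upper = x_bar + t_crit * se

print(round(x_bar, 2), round(s, 2), round(se, 2))
print(round(lower, 2), round(upper, 2))
```

The printed values can be compared line by line with the Mean, StDev, SE Mean and 95% CI columns of the output.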
Estimation: Interval Estimation: Two Population Means

Types of Interval Estimation / Confidence Interval for Two Population Means

Independent Samples
• Two samples are independent if they are drawn from two different populations and the elements of the first
sample have no relationship to the elements of the second sample.
• Example: To determine the difference in mean pH of rainfall in Shah Alam and Klang.

Dependent Samples
• Two samples are dependent if they are drawn from two populations and the elements of the first
sample are related to the elements of the second sample.
• Example: To determine the effectiveness of Kevin Zahari’s diet program, each participant’s weight is measured before and
after the program.

Let μ1 and μ2 be the means of the first and second populations respectively. We want to find the confidence interval for the
difference between the two population means, μ1 − μ2. Then $\bar{x}_1 - \bar{x}_2$ is the sample statistic used to construct the confidence
interval.
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Known)

Two Population Means and Known σ²

Assumption

• Random samples
• The populations are normally distributed
• Small (n < 30) or large (n ≥ 30) samples
• $\sigma_1^2$ and $\sigma_2^2$ are known

Formula

The (1 − α)100% confidence interval for the difference between two population means, μ1 − μ2, is

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Known)
Example 2.14: An experiment was conducted in which two types of engines, A and B were compared. Gas mileage in miles
per gallon was measured. 75 experiments were conducted using engine type A and 50 experiments were done for engine type
B. The gasoline used and other conditions were held constant. The average gas mileage for engine A was 42 miles per gallon
and the average for engine B was 36 miles per gallon.
Find a 96% confidence interval on μA − μB, where μA and μB are the population mean gas mileages for engine A and engine B
respectively. Assume that the population standard deviations are 8 and 6 for engines A and B respectively.
(Ans: (3.4240, 8.5760))
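Example 2.14 can be worked through as follows. This is a minimal sketch with illustrative variable names; the critical value is computed with `NormalDist.inv_cdf`, so the result may differ from a table-based answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Example 2.14: engine A and engine B
n_a, mean_a, sigma_a = 75, 42, 8
n_b, mean_b, sigma_b = 50, 36, 6
confidence = 0.96

# z_{alpha/2} for a 96% interval (alpha/2 = 0.02)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the difference between the two sample means
se = sqrt(sigma_a ** 2 / n_a + sigma_b ** 2 / n_b)

diff = mean_a - mean_b
lower, upper = diff - z * se, diff + z * se

print(round(lower, 2), round(upper, 2))
```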
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 = σ22 )

Two Population Means, Unknown σ² and $\sigma_1^2 = \sigma_2^2$

Assumption

• One or both sample sizes are less than 30
• The populations are normally distributed
• $\sigma_1^2$ and $\sigma_2^2$ are unknown but the variances are assumed to be equal

Formula

The (1 − α)100% confidence interval for the difference between two population means, μ1 − μ2, is

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

where
$df = n_1 + n_2 - 2$
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
$s_p$ = pooled standard deviation of the two samples
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 = σ22 )
Example 2.15: An insurance company wants to know if the average speed at which men drive cars is greater than that of
women drivers. The company took a random sample of 26 cars driven by men on a highway and found the mean speed to be
72 miles per hour with a standard deviation of 2.2 miles per hour. Another sample of 16 cars driven by women on the same
highway gave a mean speed of 68 miles per hour with a standard deviation of 2.5 miles per hour. Assume that the speeds at
which all men and all women drive cars on this highway are both normally distributed with the same population standard
deviation.
Construct a 98% confidence interval for the difference between the mean speeds of cars driven by all men and all women on
this highway. (Ans: sp =2.3171, (2.2163, 5.7837))
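The pooled calculation in Example 2.15 can be sketched as below. The critical value t = 2.423 for α/2 = 0.01 and 40 degrees of freedom is taken from a t table and hard-coded; that value is an assumption of this sketch, and small rounding differences from the stated answer are expected.

```python
from math import sqrt

# Example 2.15: speeds of cars driven by men (1) and women (2)
n1, mean1, s1 = 26, 72, 2.2
n2, mean2, s2 = 16, 68, 2.5

# Pooled standard deviation and degrees of freedom
df = n1 + n2 - 2
sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)

# Assumption: t_{0.01, 40} = 2.423, read from a t table (98% confidence)
t_crit = 2.423
margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)

diff = mean1 - mean2
lower, upper = diff - margin, diff + margin

print(df, round(sp, 4))
print(round(lower, 2), round(upper, 2))
```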
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 = σ22 )
Example 2.16: The manufacturer of a small battery-powered tape recorder decides to include four alkaline batteries with its
product. Two battery suppliers are being considered; each has its own brand (brand 1 and brand 2). The supervising inspector
of incoming quality wants to know if the average lifetimes of two brands are the same. Based on past experience, she
believes that the battery lifetimes follow a normal distribution with equal variances. A sample experiment is conducted: each
of ten batteries (five of each brand) is connected to a test device that places a small drain on the battery power and records
the battery lifetime. The following results (in hours) are obtained:
Brand 1 43 48 38 41 51
Brand 2 30 26 37 31 34

Use the computer output to answer the following questions.
a) Construct a 95% confidence interval on the differences between the average lifetimes of the two brands.
(Ans: (5.6815, 19.5185))
b) Can the supervising inspector of incoming quality conclude that the average lifetimes of the two brands are equal?
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 ≠ σ22 )

Two Population Means, Unknown σ² and $\sigma_1^2 \neq \sigma_2^2$

Assumption

• One or both sample sizes are less than 30
• The populations are normally distributed
• $\sigma_1^2$ and $\sigma_2^2$ are unknown but the variances are assumed to be unequal

Formula

The (1 − α)100% confidence interval for the difference between two population means, μ1 − μ2, is

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

where

$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 ≠ σ22 )
Example 2.17: The breaking strengths of 11 bundles of wool fibres have a sample mean 436.5 and a sample standard
deviation of 11.90. In addition, the breaking strengths of another 12 bundles of synthetic fibres have a sample mean 452.8 and
a sample standard deviation 4.61. Assume the breaking strengths of the two populations are normally distributed with unequal
variances.
Construct a 95% confidence interval on the mean difference of breaking strengths between wool fibres and synthetic fibres.
Explain your answer. (Ans: (-24.6386,-7.9614))
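The degrees of freedom formula for unequal variances is the step most prone to arithmetic slips, so Example 2.17 is sketched below. Two assumptions are made here: the computed df is rounded down to a whole number, and the critical value t = 2.179 for α/2 = 0.025 and 12 degrees of freedom is taken from a t table and hard-coded.

```python
from math import floor, sqrt

# Example 2.17: wool fibres (1) and synthetic fibres (2)
n1, mean1, s1 = 11, 436.5, 11.90
n2, mean2, s2 = 12, 452.8, 4.61

v1 = s1 ** 2 / n1   # s1^2 / n1
v2 = s2 ** 2 / n2   # s2^2 / n2

# Degrees of freedom for unequal variances
df_exact = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
df = floor(df_exact)   # assumption: round down to a whole number

# Assumption: t_{0.025, 12} = 2.179, read from a t table (95% confidence)
t_crit = 2.179
margin = t_crit * sqrt(v1 + v2)

diff = mean1 - mean2
lower, upper = diff - margin, diff + margin

print(round(df_exact, 2), df)
print(round(lower, 1), round(upper, 1))
```

Both limits are negative, which is consistent with the stated answer for the difference (wool minus synthetic).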
Estimation: Interval Estimation: Two Population Means (Independent: Variance σ2 Unknown: σ12 ≠ σ22 )
Example 2.18: A set of facilitation tools to help with data analysis for problem solving is being developed by a group of
statisticians at UiTM. In order to test effectiveness of these tools, a group of research officers were asked to analyse and
produce a built-in report for a set of data on the computer. Twelve equally capable research officers were randomly selected
and six were randomly assigned a standard procedure to complete the task. The other six were asked to do the task using the
developed facilitation tools. The response measured was the time to completion (in minutes).
Group 1 (standard procedure) 61 69 68 74 58 63
Group 2 (facilitation tool) 32 42 40 34 38 33
Assume that the population variances of the two procedures are different. Construct a 99% confidence interval to estimate
the difference between the average completion times for the two procedures. (Ans: (19.1802, 38.8198))
Can you conclude that the facilitation tools increase the speed with which the task is completed by more than 20 minutes?
Estimation: Interval Estimation: Two Population Means (Dependent)

Two Population Means, Dependent Samples

Matched or paired samples

• Involve a procedure whereby pairs of observations are matched as closely as possible according to certain relevant
characteristics

Formula

The (1 − α)100% confidence interval for the mean difference between two observations from matched samples, $\mu_d$, is

$\bar{d} \pm t_{\alpha/2,\,n-1} \frac{s_d}{\sqrt{n}}$

$\bar{d} = \frac{\sum d}{n}$

$s_d = \sqrt{\frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]}$

where
$\mu_d$ = the mean of the paired differences of the population
$\bar{d}$ = the mean of the paired differences of the sample
$s_d$ = the standard deviation of the paired differences of the sample
n = the number of paired difference values
Estimation: Interval Estimation: Two Population Means (Dependent)
Example 2.19: The manufacturer of a gasoline additive claimed that the use of this additive increases gasoline mileage. A
random sample of six cars was selected and these cars were driven for one week without the gasoline additive and then for
one week with the gasoline additive. The following table gives the miles per gallon for these cars without and with the
gasoline additive. Assume that the population of paired differences is normally distributed.
Without 24.6 28.3 18.9 23.7 15.4 29.5
With 26.3 31.7 18.2 25.3 18.3 30.9

Construct a 95% confidence interval for the difference in mean mileage per gallon for cars without and with the gasoline
additive and interpret the interval. (Ans: (-3.2150, -0.2184))
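The paired calculation in Example 2.19 can be sketched as follows, taking each difference as "without" minus "with". The critical value t = 2.571 for α/2 = 0.025 and 5 degrees of freedom is taken from a t table and hard-coded, which is an assumption of this sketch; the limits agree with the stated answer up to rounding of intermediate values.

```python
from math import sqrt
from statistics import mean, stdev

# Example 2.19: miles per gallon without and with the additive
without = [24.6, 28.3, 18.9, 23.7, 15.4, 29.5]
with_additive = [26.3, 31.7, 18.2, 25.3, 18.3, 30.9]

# Paired differences d = without - with
d = [a - b for a, b in zip(without, with_additive)]

n = len(d)
d_bar = mean(d)   # mean of the paired differences
s_d = stdev(d)    # standard deviation of the paired differences

# Assumption: t_{0.025, 5} = 2.571, read from a t table (95% confidence)
t_crit = 2.571
margin = t_crit * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(round(d_bar, 4), round(s_d, 4))
print(round(lower, 2), round(upper, 2))
```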

Estimation: Interval Estimation: Two Population Means (Dependent)
Example 2.20: Many engineering students are having problems in data analysis using statistical software. A professor who
teaches statistics for engineering course offered a two day workshop on this topic. The following table gives the test scores
of seven engineering students before and after they attended the workshop.
Before 56 69 48 74 65 71 58
After 62 73 44 85 71 70 69

The data collected was analysed and the output is shown as follows:
Paired T-Test and CI: before, after

Paired T for before – after

N Mean StDev SE Mean


before 7 63.00 9.35 3.53
after 7 67.71 12.51 4.73
Difference 7 -4.71 5.65 2.13

95% CI for mean difference: (-9.94, 0.51)


T-Test of mean difference = 0 (vs ≠ 0): T-Value = -2.21 P-Value = 0.069

a) Show that the 95% confidence interval for the difference in mean test scores before and after attending the workshop is
between -9.94 and 0.51.
b) Can we conclude that attending the workshop increases the test score?
Estimation: Interval Estimation: Population Variance

Confidence Interval for Population Variance

One Population: Chi-Square Distribution
Ratio of Two Populations: F Distribution

Characteristics (of both distributions)

• All values ≥ 0 (cannot be negative)
• Area under the curve = 1
• The distribution is skewed to the right
Estimation: Interval Estimation: One Population Variance, σ2 / Population Standard Deviation, σ

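One Population Variance / Standard Deviation (Chi-Square Distribution)

Assumption

The population is normally distributed.

Characteristics

• All chi-square values are greater than or equal to 0
• The chi-square distribution is a family of curves based on the degrees of freedom
• The area under each chi-square distribution curve is equal to 1
• The distribution is skewed to the right

Formula

The (1 − α)100% confidence interval for a variance, σ²:

$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$, or equivalently $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$

The (1 − α)100% confidence interval for a standard deviation, σ:

$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}$, or equivalently $\left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}\right)$

where df = n − 1 and $\chi^2_{p,\,n-1}$ denotes the chi-square value with area p to its right.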
Estimation: Interval Estimation: One Population Variance, σ2 / Population Standard Deviation, σ
Example 2.21: Find the 95% confidence interval for the variance and standard deviation of the nicotine content of cigarettes
manufactured if a random sample of 20 cigarettes has a standard deviation of 1.6 milligrams. Assume that the variable is
normally distributed. (Ans: (1.4806 < σ2 < 5.4609), (1.2168< σ <2.3369))
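Example 2.21 can be checked with a few lines of arithmetic. This is a minimal sketch; the two critical values for 19 degrees of freedom (32.852 with area 0.025 to the right and 8.907 with area 0.975 to the right) are taken from a chi-square table and hard-coded, which is an assumption of this sketch.

```python
from math import sqrt

# Example 2.21: nicotine content of n = 20 cigarettes
n, s = 20, 1.6

# Assumption: chi-square table values for df = n - 1 = 19
chi2_upper = 32.852   # chi-square with area 0.025 to the right
chi2_lower = 8.907    # chi-square with area 0.975 to the right

numerator = (n - 1) * s ** 2   # (n - 1) s^2

# 95% confidence interval for the variance
var_low = numerator / chi2_upper
var_high = numerator / chi2_lower

# 95% confidence interval for the standard deviation
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

print(round(var_low, 4), round(var_high, 4))
print(round(sd_low, 4), round(sd_high, 4))
```

Note that the larger chi-square value produces the lower limit, because it appears in the denominator.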
Estimation: Interval Estimation: One Population Variance, σ2 / Population Standard Deviation, σ
Example 2.22: Find the 90% confidence interval for the variance and standard deviation for the price in dollars of an adult
single-day ski lift ticket. The data represent a selected sample of nationwide ski resorts. Assume the variable is normally
distributed.
59 54 53 52 51 39 49 46 49 48
Estimation: Interval Estimation: One Population Variance, σ2 / Population Standard Deviation, σ
Below is the Minitab output of the confidence interval for one variance using the data given in Example 2.22.
Test and CI for One Variance: ski_lift_ticket

Method

The chi-square method is only for the normal distribution.


The Bonett method is for any continuous distribution.

Statistics

Variable N StDev Variance


ski_lift_ticket 10 5.31 28.2

90% Confidence Intervals

CI for CI for
Variable Method StDev Variance
ski_lift_ticket Chi-Square (3.87, 8.74) (15.0, 76.4)
Bonett (3.35, 10.09) (11.2, 101.8)

Based on the confidence interval in the given output, can we conclude that the standard deviation of the price in dollars of
an adult single-day ski lift ticket differs?
Estimation: Interval Estimation: Ratio of Two Population Variances

Two Population Variances (F Distribution)

Assumption

The populations are normally distributed.

Characteristics

• The F distribution is continuous and skewed to the right
• The shape of the F distribution depends on two degrees of freedom (d.f.): one for the numerator and another for the
denominator
• The values of an F distribution are non-negative and are denoted by $F_{\alpha,\,v_1,\,v_2}$, where $v_1$ is the d.f. for the numerator and
$v_2$ is the d.f. for the denominator

Formula

The (1 − α)100% confidence interval for the ratio of two population variances, $\frac{\sigma_1^2}{\sigma_2^2}$, is

$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,v_1,\,v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,v_2,\,v_1}$, or equivalently $\left(\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,v_1,\,v_2}},\ \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,v_2,\,v_1}\right)$

The (1 − α)100% confidence interval for the ratio of two population standard deviations, $\frac{\sigma_1}{\sigma_2}$, is

$\sqrt{\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,v_1,\,v_2}}} < \frac{\sigma_1}{\sigma_2} < \sqrt{\frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,v_2,\,v_1}}$

where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$
Estimation: Interval Estimation: Ratio of Two Population Variances
Example 2.23: The manufacturer of a small battery-powered tape recorder decides to include four alkaline batteries with its
product. Two battery suppliers are being considered; each has its own brand (brand 1 and brand 2). The supervising
inspector of incoming quality believes that the battery lifetimes follow a normal distribution with equal variances. A sample
experiment is conducted: each of ten batteries (five of each brand) is connected to a test device that places a small drain on
the battery power and records the battery lifetime. The following results (in hours) are obtained:
Brand 1 43 48 38 41 51
Brand 2 30 26 37 31 34

a) Construct a 95% confidence interval on the ratio of the variances of lifetimes of the battery of the two brands. Interpret
the confidence interval obtained. (Ans: (0.1668, 15.3714))
b) Does the interval support the supervising inspector’s belief that the variances of the lifetimes of the two brands are equal?
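Part (a) of Example 2.23 can be sketched as follows. Because both samples have 4 degrees of freedom, the same critical value serves both limits; F = 9.60 for α/2 = 0.025 with (4, 4) degrees of freedom is taken from an F table and hard-coded, which is an assumption of this sketch.

```python
from statistics import variance

# Example 2.23: battery lifetimes (in hours)
brand1 = [43, 48, 38, 41, 51]
brand2 = [30, 26, 37, 31, 34]

s1_sq = variance(brand1)   # sample variance of brand 1 (n - 1 divisor)
s2_sq = variance(brand2)   # sample variance of brand 2
ratio = s1_sq / s2_sq

# Assumption: F_{0.025, 4, 4} = 9.60, read from an F table.
# v1 = v2 = 4, so F_{0.025, v1, v2} and F_{0.025, v2, v1} coincide.
f_crit = 9.60

lower = ratio / f_crit
upper = ratio * f_crit

print(round(s1_sq, 1), round(s2_sq, 1), round(ratio, 4))
print(round(lower, 4), round(upper, 2))
```

For part (b), the relevant check is whether the value 1 lies inside the interval for the ratio of variances.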
Estimation: Interval Estimation: Ratio of Two Population Variances
Example 2.24: The following Minitab output was obtained from two independent samples selected from two normally
distributed populations with unknown and unequal variances. Show that the 95% confidence interval for the ratio of the standard
deviations of the two populations is as given in the output.
Test and CI for Two Variances: S1, S3

Statistics

95% CI for
Variable N StDev Variance StDevs
S1 13 8.309 69.038 (5.958, 13.716)
S3 9 6.564 43.092 (4.434, 12.576)

Ratio of standard deviations = 1.266


Ratio of variances = 1.602

95% Confidence Intervals

CI for
CI for StDev Variance
Method Ratio Ratio
F (0.618, 2.372) (0.381, 5.626)
