
Outline

• Inference about a Population Variance
• Inferences about Two Population Variances

Inferences About
Population Variance
One Population scenario

Inferences About a Population Variance

• A variance can provide important decision-making information.
• Consider the production process of filling containers with a liquid detergent product.
• The mean filling weight is important, but so is the variance of the filling weights.
• By selecting a sample of containers, we can compute a sample variance for the amount of detergent placed in a container.
• If the sample variance is excessive, overfilling and underfilling may be occurring even though the mean is correct.

Inferences About a Population Variance

• Chi-Square Distribution
• Interval Estimation of $\sigma^2$
• Hypothesis Testing
• F-Distribution

Chi-Square Distribution

• A chi-square random variable is the sum of squared independent standardized normal random variables, such as $(z_1)^2 + (z_2)^2 + (z_3)^2$ and so on.
• The chi-square distribution is based on sampling from a normal population.
• The sampling distribution of $(n-1)s^2/\sigma^2$ has a chi-square distribution with $n - 1$ degrees of freedom whenever a simple random sample of size $n$ is selected from a normal population.
• We can use the chi-square distribution to develop interval estimates and conduct hypothesis tests about a population variance.

Examples of Sampling Distribution of $(n-1)s^2/\sigma^2$

[Figure: chi-square density curves with 2, 5, and 10 degrees of freedom; horizontal axis $(n-1)s^2/\sigma^2$, starting at 0.]
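The sampling-distribution claim can be checked informally by simulation. The sketch below is only illustrative: the population mean (50), standard deviation (2), sample size (10), and number of repetitions are assumptions chosen for the demonstration, not values from the slides. It relies on the standard fact that a chi-square variable with k degrees of freedom has mean k and variance 2k.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 10          # sample size (assumed for illustration)
sigma = 2.0     # population standard deviation (assumed for illustration)
reps = 10000    # number of simulated samples

values = []
for _ in range(reps):
    # Draw a simple random sample from a normal population.
    sample = [random.gauss(50.0, sigma) for _ in range(n)]
    # Compute (n - 1) s^2 / sigma^2 for this sample.
    values.append((n - 1) * statistics.variance(sample) / sigma ** 2)

mean_value = statistics.mean(values)
var_value = statistics.variance(values)

# For a chi-square distribution with n - 1 = 9 d.f. we expect
# a mean near 9 and a variance near 18.
print(round(mean_value, 2), round(var_value, 2))
```

If the population were not normal, the simulated values would in general not follow a chi-square distribution, which is why the normality condition appears in the bullet above.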

Chi-Square Distribution

• We will use the notation $\chi^2_{\alpha}$ to denote the value for the chi-square distribution that provides an area of $\alpha$ to the right of the stated $\chi^2_{\alpha}$ value.
• For example, there is a .95 probability of obtaining a $\chi^2$ (chi-square) value such that

$$\chi^2_{.975} \le \chi^2 \le \chi^2_{.025}$$

Interval Estimation of $\sigma^2$

$$\chi^2_{.975} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{.025}$$

[Figure: chi-square density with an area of .025 in each tail; 95% of the possible $\chi^2$ values lie between $\chi^2_{.975}$ and $\chi^2_{.025}$.]
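The upper-tail notation can be made concrete with a small numerical sketch. The helper names `chi2_cdf` and `chi2_upper_value` are hypothetical (they are not from the slides), and the method, a series expansion of the chi-square cumulative probability followed by bisection, is just one simple way to do this with the Python standard library. In practice a statistics package or a printed table would normally be used.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x for a chi-square distribution with df d.f.

    Uses the series expansion of the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_upper_value(alpha, df):
    """Value with area alpha to its RIGHT, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:   # widen until hi is past the target
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Values bounding the middle 95% of a chi-square distribution with 9 d.f.
lower = chi2_upper_value(0.975, 9)
upper = chi2_upper_value(0.025, 9)
print(round(lower, 3), round(upper, 3))
```

The two results should agree, to table precision, with the entries 2.700 and 19.023 in the 9 d.f. row of the chi-square table used later in the example.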

Interval Estimation of $\sigma^2$

• There is a $(1 - \alpha)$ probability of obtaining a $\chi^2$ value such that

$$\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}$$

• Substituting $(n-1)s^2/\sigma^2$ for the $\chi^2$ we get

$$\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}$$

• Performing algebraic adjustments we get

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

Interval Estimation of $\sigma^2$

• Interval Estimate of a Population Variance

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

where the $\chi^2$ values are based on a chi-square distribution with $n - 1$ degrees of freedom and where $1 - \alpha$ is the confidence coefficient.
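The "algebraic adjustments" step can be spelled out as follows. This is a sketch of the usual argument; it relies only on the fact that every quantity involved is positive, so taking reciprocals reverses the inequalities.

```latex
\begin{align*}
\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}
&\;\Longrightarrow\;
\frac{1}{\chi^2_{\alpha/2}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{1-\alpha/2}}
&& \text{(take reciprocals)} \\
&\;\Longrightarrow\;
\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}
&& \text{(multiply by } (n-1)s^2 \text{)}
\end{align*}
```

Note that the larger chi-square value ends up in the denominator of the lower limit.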

Interval Estimation of $\sigma$

• Interval Estimate of a Population Standard Deviation

Taking the square root of the upper and lower limits of the variance interval provides the confidence interval for the population standard deviation.

$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$

Interval Estimation of $\sigma^2$

• Example: Buyer’s Digest (A)

Buyer’s Digest rates thermostats manufactured for home temperature control. In a recent test, 10 thermostats manufactured by ThermoRite were selected and placed in a test room that was maintained at a temperature of 68°F. The temperature readings of the ten thermostats are shown on the next slide.

Interval Estimation of $\sigma^2$

• Example: Buyer’s Digest (A)

We will use the 10 readings below to develop a 95% confidence interval estimate of the population variance.

Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2

Interval Estimation of $\sigma^2$

For $n - 1 = 10 - 1 = 9$ d.f. and $\alpha = .05$

Selected Values from the Chi-Square Distribution Table

Degrees of                         Area in Upper Tail
Freedom      .99     .975    .95     .90     .10      .05      .025     .01
 5           0.554   0.831   1.145   1.610    9.236   11.070   12.832   15.086
 6           0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812
 7           1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475
 8           1.647   2.180   2.733   3.490   13.362   15.507   17.535   20.090
 9           2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666
10           2.558   3.247   3.940   4.865   15.987   18.307   20.483   23.209

Our value: $\chi^2_{.975} = 2.700$ (9 d.f., upper-tail area .975)

Interval Estimation of $\sigma^2$

For $n - 1 = 10 - 1 = 9$ d.f. and $\alpha = .05$

$$2.700 \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{.025}$$

[Figure: chi-square density with 9 d.f.; the area in the upper tail to the right of 2.700 is .975, leaving .025 in the lower tail.]

Interval Estimation of $\sigma^2$

For $n - 1 = 10 - 1 = 9$ d.f. and $\alpha = .05$

From the 9 d.f. row of the same chi-square table, the value with an upper-tail area of .025 is our second value: $\chi^2_{.025} = 19.023$.

Interval Estimation of $\sigma^2$

$n - 1 = 10 - 1 = 9$ degrees of freedom and $\alpha = .05$

$$2.700 \le \frac{(n-1)s^2}{\sigma^2} \le 19.023$$

[Figure: chi-square density with 9 d.f.; area of .025 in the lower tail below 2.700 and area of .025 in the upper tail above 19.023.]

Interval Estimation of $\sigma^2$

• Sample variance $s^2$ provides a point estimate of $\sigma^2$.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{6.3}{9} = .70$$

• A 95% confidence interval for the population variance is given by:

$$\frac{(10-1)(.70)}{19.02} \le \sigma^2 \le \frac{(10-1)(.70)}{2.70}$$

$$.33 < \sigma^2 < 2.33$$
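The interval calculation above can be reproduced in a few lines. This is a minimal sketch using the ten ThermoRite readings from the example and the two chi-square table values for 9 d.f.; the variable names are illustrative. The last step also takes square roots to get the corresponding interval for the population standard deviation.

```python
import math
import statistics

# Thermostat readings from the Buyer's Digest (A) example.
readings = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]

n = len(readings)
s2 = statistics.variance(readings)   # sample variance, divisor n - 1

# Chi-square table values for n - 1 = 9 d.f. and alpha = .05.
chi2_025 = 19.023   # upper-tail area .025
chi2_975 = 2.700    # upper-tail area .975

# 95% confidence interval for the population variance.
var_low = (n - 1) * s2 / chi2_025
var_high = (n - 1) * s2 / chi2_975

# Square roots give the interval for the population standard deviation.
sd_low = math.sqrt(var_low)
sd_high = math.sqrt(var_high)

print(f"s^2 = {s2:.2f}")
print(f"variance interval: {var_low:.2f} to {var_high:.2f}")
print(f"std. dev. interval: {sd_low:.2f} to {sd_high:.2f}")
```

The larger table value (19.023) produces the lower limit and the smaller one (2.700) the upper limit, matching the formula on the earlier slide.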

Hypothesis Testing About a Population Variance

• Left-Tailed Test
  • Hypotheses

$$H_0: \sigma^2 \ge \sigma_0^2$$
$$H_a: \sigma^2 < \sigma_0^2$$

    where $\sigma_0^2$ is the hypothesized value for the population variance
  • Test Statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$

Hypothesis Testing About a Population Variance

• Left-Tailed Test (continued)
  • Rejection Rule
    Critical value approach: Reject $H_0$ if $\chi^2 \le \chi^2_{1-\alpha}$
    p-Value approach: Reject $H_0$ if p-value < $\alpha$

    where $\chi^2_{1-\alpha}$ is based on a chi-square distribution with $n - 1$ d.f.

Hypothesis Testing About a Population Variance

• Right-Tailed Test
  • Hypotheses

$$H_0: \sigma^2 \le \sigma_0^2$$
$$H_a: \sigma^2 > \sigma_0^2$$

    where $\sigma_0^2$ is the hypothesized value for the population variance
  • Test Statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$

Hypothesis Testing About a Population Variance

• Right-Tailed Test (continued)
  • Rejection Rule
    Critical value approach: Reject $H_0$ if $\chi^2 \ge \chi^2_{\alpha}$
    p-Value approach: Reject $H_0$ if p-value < $\alpha$

    where $\chi^2_{\alpha}$ is based on a chi-square distribution with $n - 1$ d.f.

Hypothesis Testing About a Population Variance

• Two-Tailed Test
  • Hypotheses

$$H_0: \sigma^2 = \sigma_0^2$$
$$H_a: \sigma^2 \ne \sigma_0^2$$

    where $\sigma_0^2$ is the hypothesized value for the population variance
  • Test Statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$

Hypothesis Testing About a Population Variance

• Two-Tailed Test (continued)
  • Rejection Rule
    Critical value approach: Reject $H_0$ if $\chi^2 \le \chi^2_{1-\alpha/2}$ or $\chi^2 \ge \chi^2_{\alpha/2}$
    p-Value approach: Reject $H_0$ if p-value < $\alpha$

    where $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ are based on a chi-square distribution with $n - 1$ d.f.

Hypothesis Testing About a Population Variance

• Example: Buyer’s Digest (B)

Recall that Buyer’s Digest is rating ThermoRite thermostats. Buyer’s Digest gives an “acceptable” rating to a thermostat with a temperature variance of 0.5 or less.
We will conduct a hypothesis test (with $\alpha = .10$) to determine whether the ThermoRite thermostat’s temperature variance is “acceptable”.

Hypothesis Testing About a Population Variance

• Example: Buyer’s Digest (B)

Using the 10 readings, we will conduct a hypothesis test (with $\alpha = .10$) to determine whether the ThermoRite thermostat’s temperature variance is “acceptable”.

Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2

Hypothesis Testing About a Population Variance

• Hypotheses (right-tailed test)

$$H_0: \sigma^2 \le 0.5$$
$$H_a: \sigma^2 > 0.5$$

• Rejection Rule
  Reject $H_0$ if $\chi^2 > 14.684$

Hypothesis Testing About a Population Variance

For $n - 1 = 10 - 1 = 9$ d.f. and $\alpha = .10$

Selected Values from the Chi-Square Distribution Table

Degrees of                         Area in Upper Tail
Freedom      .99     .975    .95     .90     .10      .05      .025     .01
 5           0.554   0.831   1.145   1.610    9.236   11.070   12.832   15.086
 6           0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812
 7           1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475
 8           1.647   2.180   2.733   3.490   13.362   15.507   17.535   20.090
 9           2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666
10           2.558   3.247   3.940   4.865   15.987   18.307   20.483   23.209

Our value: $\chi^2_{.10} = 14.684$ (9 d.f., upper-tail area .10)

Hypothesis Testing About a Population Variance

• Rejection Region

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9s^2}{.5}$$

[Figure: chi-square density with 9 d.f.; the rejection region ("Reject $H_0$") is the upper tail to the right of 14.684, with area in the upper tail = .10.]

Hypothesis Testing About a Population Variance

• Test Statistic
  The sample variance is $s^2 = 0.7$.

$$\chi^2 = \frac{9(.7)}{.5} = 12.6$$

• Conclusion
  Because $\chi^2 = 12.6$ is less than 14.684, we cannot reject $H_0$. The sample variance $s^2 = .7$ is insufficient evidence to conclude that the temperature variance for ThermoRite thermostats is unacceptable.
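The right-tailed test above can be sketched in code. The test statistic and the critical-value decision use only numbers from the example; the upper-tail area (the p-value used in the p-value approach) is computed with a hypothetical helper, `chi2_upper_tail`, based on a series expansion. That helper is an illustrative stand-in for a statistics package or table lookup, not part of the original slides.

```python
import math
import statistics

def chi2_upper_tail(x, df):
    """Area to the right of x under a chi-square density with df d.f."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:       # series for the lower-tail area
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower

readings = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
sigma0_sq = 0.5       # hypothesized variance ("acceptable" limit)
alpha = 0.10
critical = 14.684     # table value: 9 d.f., upper-tail area .10

n = len(readings)
s2 = statistics.variance(readings)
chi2_stat = (n - 1) * s2 / sigma0_sq

# Critical value approach (right-tailed test).
reject_by_critical = chi2_stat > critical

# p-value approach: area to the right of the test statistic.
p_value = chi2_upper_tail(chi2_stat, n - 1)
reject_by_p = p_value < alpha

print(f"chi-square = {chi2_stat:.1f}, p-value = {p_value:.4f}")
print("reject H0:", reject_by_critical, reject_by_p)
```

Both approaches lead to the same decision here, as they always should for the same data and significance level.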

Hypothesis Testing About a Population Variance

• Using the p-Value
  • The rejection region for the ThermoRite thermostat example is in the upper tail; thus, the appropriate p-value is less than .90 ($\chi^2 = 4.168$) and greater than .10 ($\chi^2 = 14.684$).
  • Because the p-value > $\alpha = .10$, we cannot reject the null hypothesis.
  • The sample variance of $s^2 = .7$ is insufficient evidence to conclude that the temperature variance is unacceptable (> .5).

The exact p-value is .18156.

Container Filling Machine

Container-filling machines are used to package a variety of liquids, including milk, soft drinks, and paint. Ideally, the amount of liquid should vary only slightly because large variations will cause some containers to be underfilled (cheating the customer) and some to be overfilled (resulting in costly waste). The president of a company that developed a new type of machine boasts that this machine can fill 1-liter (1,000 cubic centimeters) containers so consistently that the variance of the fills will be less than 1 cc². To examine the veracity of the claim, a random sample of 25 1-liter fills was taken, and the results (cubic centimeters) recorded. These data are listed here. Do these data allow the president to make this claim at the 5% significance level?

Data

Inferences About Two Population Variances

• We may want to compare the variances in:
  • product quality resulting from two different production processes,
  • temperatures for two heating devices, or
  • assembly times for two assembly methods.
• We use data collected from two independent random samples, one from population 1 and another from population 2.
• The two sample variances will be the basis for making inferences about the two population variances.

Hypothesis Testing About the Variances of Two Populations

• One-Tailed Test
  • Hypotheses

$$H_0: \sigma_1^2 \le \sigma_2^2$$
$$H_a: \sigma_1^2 > \sigma_2^2$$

    Denote the population providing the larger sample variance as population 1.
  • Test Statistic

$$F = \frac{s_1^2}{s_2^2}$$

Hypothesis Testing About the Variances of Two Populations

• One-Tailed Test (continued)
  • Rejection Rule
    Critical value approach: Reject $H_0$ if $F > F_{\alpha}$
    where the value of $F_{\alpha}$ is based on an F distribution with $n_1 - 1$ (numerator) and $n_2 - 1$ (denominator) d.f.
    p-Value approach: Reject $H_0$ if p-value < $\alpha$

Hypothesis Testing About the Variances of Two Populations

• Two-Tailed Test
  • Hypotheses

$$H_0: \sigma_1^2 = \sigma_2^2$$
$$H_a: \sigma_1^2 \ne \sigma_2^2$$

    Denote the population providing the larger sample variance as population 1.
  • Test Statistic

$$F = \frac{s_1^2}{s_2^2}$$

Hypothesis Testing About the Variances of Two Populations

• Two-Tailed Test (continued)
  • Rejection Rule
    Critical value approach: Reject $H_0$ if $F > F_{\alpha/2}$
    where the value of $F_{\alpha/2}$ is based on an F distribution with $n_1 - 1$ (numerator) and $n_2 - 1$ (denominator) d.f.
    p-Value approach: Reject $H_0$ if p-value < $\alpha$

Hypothesis Testing About the Variances of Two Populations

• Example: Buyer’s Digest (C)

Buyer’s Digest has conducted the same test, as was described earlier, on another 10 thermostats, this time manufactured by TempKing. The temperature readings of the ten thermostats are listed on the next slide.

We will conduct a hypothesis test with $\alpha = .10$ to see if the variances are equal for ThermoRite’s thermostats and TempKing’s thermostats.

Hypothesis Testing About the Variances of Two Populations

• Example: Buyer’s Digest (C)

ThermoRite Sample
Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2

TempKing Sample
Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.7  66.4  69.2  70.1  69.5  69.7  68.1  66.6  67.3  67.5

Hypothesis Testing About the Variances of Two Populations

• Hypotheses

$$H_0: \sigma_1^2 = \sigma_2^2$$ (TempKing and ThermoRite thermostats have the same temperature variance)
$$H_a: \sigma_1^2 \ne \sigma_2^2$$ (Their variances are not equal)

• Rejection Rule
  The F distribution table (on next slide) shows that with $\alpha = .10$ (so $\alpha/2 = .05$), 9 d.f. (numerator), and 9 d.f. (denominator), $F_{.05} = 3.18$.
  Reject $H_0$ if $F > 3.18$

Hypothesis Testing About the Variances of Two Populations

Selected Values from the F Distribution Table

Denominator   Area in    Numerator Degrees of Freedom
Degrees of    Upper
Freedom       Tail       7      8      9      10     15
8             .10        2.62   2.59   2.56   2.54   2.46
              .05        3.50   3.44   3.39   3.35   3.22
              .025       4.53   4.43   4.36   4.30   4.10
              .01        6.18   6.03   5.91   5.81   5.52
9             .10        2.51   2.47   2.44   2.42   2.34
              .05        3.29   3.23   3.18   3.14   3.01
              .025       4.20   4.10   4.03   3.96   3.77
              .01        5.61   5.47   5.35   5.26   4.96

Hypothesis Testing About the Variances of Two Populations

• Test Statistic
  TempKing’s sample variance is 1.768
  ThermoRite’s sample variance is .700

$$F = \frac{s_1^2}{s_2^2} = \frac{1.768}{.700} = 2.53$$

• Conclusion
  We cannot reject $H_0$. $F = 2.53 < F_{.05} = 3.18$.
  There is insufficient evidence to conclude that the population variances differ for the two thermostat brands.

Hypothesis Testing About the Variances of Two Populations

• Determining and Using the p-Value

Area in Upper Tail            .10    .05    .025   .01
F Value (df1 = 9, df2 = 9)    2.44   3.18   4.03   5.35

  • Because $F = 2.53$ is between 2.44 and 3.18, the area in the upper tail of the distribution is between .10 and .05.
  • But this is a two-tailed test; after doubling the upper-tail area, the p-value is between .20 and .10.
  • Because $\alpha = .10$, we have p-value > $\alpha$ and therefore we cannot reject the null hypothesis.
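The F test for the two thermostat brands can be reproduced with a short sketch. It uses the two samples from the example and the table value $F_{.05} = 3.18$ for 9 and 9 d.f.; the variable names are illustrative. Because both samples have the same size here, relabelling the populations does not change the degrees of freedom; with unequal sample sizes the numerator and denominator d.f. would have to follow the relabelling.

```python
import statistics

thermorite = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
tempking = [67.7, 66.4, 69.2, 70.1, 69.5, 69.7, 68.1, 66.6, 67.3, 67.5]

var_thermorite = statistics.variance(thermorite)
var_tempking = statistics.variance(tempking)

# Population 1 is the one with the larger sample variance,
# so the test statistic is always at least 1.
s1_sq = max(var_thermorite, var_tempking)
s2_sq = min(var_thermorite, var_tempking)
f_stat = s1_sq / s2_sq

# Two-tailed test with alpha = .10: compare with F_{alpha/2} = F_.05
# for 9 (numerator) and 9 (denominator) d.f., taken from the table.
f_critical = 3.18
reject = f_stat > f_critical

print(f"s1^2 = {s1_sq:.3f}, s2^2 = {s2_sq:.3f}, F = {f_stat:.2f}")
print("reject H0:", reject)
```

Putting the larger sample variance in the numerator is what allows the two-tailed test to use only the upper-tail critical value.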

Ex: Which Delivery is Better

An important statistical measurement in service facilities (such as restaurants and banks) is the variability in service times. As an experiment, delivery times to TPK were observed for two food delivery services (Zomato and Swiggy), and the service times for each of 100 customers were recorded. Do these data allow us to infer at the 10% significance level that the variance in service times of Zomato is less than that of Swiggy?

Data

Ex: Roads & Speed Limit

A new highway has just been completed, and the government must decide on speed limits. There are several possible choices. However, on advice from police who monitor traffic, the objective was to reduce the variation in speeds, which is thought to contribute to the number of collisions. It has been acknowledged that speed contributes to the severity of collisions. It is decided to conduct an experiment to acquire more information. Signs are posted for 1 week indicating that the speed limit is 70 mph. A random sample of cars’ speeds is measured. During the second week, signs are posted indicating that the maximum speed is 70 mph and that the minimum speed is 60 mph. Once again, a random sample of speeds is measured. Can we infer that limiting the minimum and maximum speeds reduces the variation in speeds?

Data

End of First Trimester

DAM-I Course

We will meet again…

Inference about the difference between two population means

Matched Pairs

Inferences About the Difference Between Two Population Means: Matched Samples

• With a matched-sample design each sampled item provides a pair of data values.
• This design often leads to a smaller sampling error than the independent-sample design because the variation between sampled items is eliminated as a source of sampling error.

Inferences About the Difference Between Two Population Means: Matched Samples

• Example: Express Deliveries

A Chennai-based firm has documents that must be quickly distributed to state offices throughout India. The firm must decide between two delivery services, Speed Post (India Postal Service) and Blue Dart (DHL Express), to transport its documents.

Inferences About the Difference Between Two Population Means: Matched Samples

• Example: Express Deliveries

In testing the delivery times of the two services, the firm sent two reports to a random sample of its state offices with one report carried by Speed Post and the other report carried by Blue Dart. Do the data on the next slide indicate a difference in mean delivery times for the two services? Use a .05 level of significance.

Inferences About the Difference Between Two Population Means: Matched Samples

                  Delivery Time (Hours)
District Office   BD    SP    Difference
Lucknow           32    25     7
Bhopal            30    24     6
Bengaluru         19    15     4
Kochi             16    15     1
Mumbai            15    13     2
Kolkata           18    15     3
Bhubaneswar       14    15    -1
Hyderabad         10     8     2
New Delhi          7     9    -2
Goa               16    11     5

Inferences About the Difference Between Two Population Means: Matched Samples

• p-Value and Critical Value Approaches

1. Develop the hypotheses.

$$H_0: \mu_d = 0$$
$$H_a: \mu_d \ne 0$$

Let $\mu_d$ = the mean of the difference values for the two delivery services for the population of district offices

2. Specify the level of significance. $\alpha = .05$

3. Compute the value of the test statistic.

$$\bar{d} = \frac{\sum d_i}{n} = \frac{(7 + 6 + \dots + 5)}{10} = 2.7$$

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{76.1}{9}} = 2.9$$

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{2.7 - 0}{2.9 / \sqrt{10}} = 2.94$$

Inferences About the Difference Between Two Population Means: Matched Samples

• p-Value Approach

4. Compute the p-value.
   For $t = 2.94$ and df = 9, the p-value is between .02 and .01. (This is a two-tailed test, so we double the upper-tail areas of .01 and .005.)

5. Determine whether to reject $H_0$.
   Because p-value < $\alpha = .05$, we reject $H_0$.
   We are at least 95% confident that there is a difference in mean delivery times for the two services.

Inferences About the Difference Between Two Population Means: Matched Samples

• Critical Value Approach

4. Determine the critical value and rejection rule.
   For $\alpha = .05$ and df = 9, $t_{.025} = 2.262$.
   Reject $H_0$ if $t \le -2.262$ or $t \ge 2.262$

5. Determine whether to reject $H_0$.
   Because $t = 2.94 > 2.262$, we reject $H_0$.
   We are at least 95% confident that there is a difference in mean delivery times for the two services.
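The matched-sample calculation can be sketched as follows, using the delivery times from the table and the critical value $t_{.025} = 2.262$ for 9 d.f. quoted above. Variable names are illustrative. Because the code does not round $s_d$ to 2.9 before dividing, its t value differs slightly in the third decimal from a hand calculation, but it still rounds to 2.94.

```python
import math
import statistics

# Delivery times in hours for the ten district offices, in table order.
blue_dart = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
speed_post = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

# Each office contributes one difference (matched pair).
diffs = [bd - sp for bd, sp in zip(blue_dart, speed_post)]

n = len(diffs)
d_bar = statistics.mean(diffs)    # mean difference
s_d = statistics.stdev(diffs)     # sample std. dev. of the differences
mu_d0 = 0                         # hypothesized mean difference under H0

t_stat = (d_bar - mu_d0) / (s_d / math.sqrt(n))

# Two-tailed test at alpha = .05 with n - 1 = 9 d.f. (table value).
t_critical = 2.262
reject = abs(t_stat) >= t_critical

print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t_stat:.2f}")
print("reject H0:", reject)
```

The test works only with the ten differences, which is why the between-office variation drops out as a source of sampling error.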

SUMMARY
INFERENTIAL
STATISTICS
