
CHAPTER 3

Chi-Square Tests

a) Hypothesis Tests about the Variance (Single Variance)


b) Hypothesis Tests about the Variance (Two Variances)
The Chi-square Distribution

• Like the t distribution, the chi-square distribution has only one parameter,
called the degrees of freedom (df). The shape of a specific chi-square
distribution curve depends on the number of degrees of freedom.
• The degrees of freedom for a chi-square distribution are calculated by using
different formulas for different tests.
• The random variable $\chi^2$ assumes nonnegative values only. Hence, a chi-square
distribution curve starts at the origin (zero point) and lies entirely to the right of
the vertical axis.

Figure A: Three chi-square distribution curves


• As we can see from Figure A, the chi-square distribution curve is
strongly skewed to the right for very small degrees of freedom, and its
shape changes drastically as the degrees of freedom increase.

• Eventually, for large df, the chi-square distribution curve looks like a
normal distribution curve (it becomes nearly symmetric).

• The peak (or mode) of a chi-square distribution curve occurs at zero for
1 or 2 degrees of freedom and at df – 2 for 3 or more degrees of freedom.

• The total area under a chi-square distribution curve is 1.0.

How to find the value of $\chi^2$ from the $\chi^2$ distribution table

If we know the df and the area in the right tail of a chi-square distribution
curve, we can find the value of $\chi^2$ from the table. The following examples
show how to read that table.


EXAMPLE 1:
Find the value of $\chi^2$ for 7 df and an area of 0.10 in the right tail of the
chi-square distribution curve.

SOLUTION:
To find the required value of $\chi^2$, we locate 7 in the column for df and 0.100
in the top row of the table. The entry at the intersection of that row and
column, 12.017, is the required value.

EXAMPLE 2:
Find the value of $\chi^2$ for 12 df and an area of 0.05 in the left tail of the
chi-square distribution curve.

SOLUTION:
We can read the $\chi^2$ distribution table only when an area in the right tail of
the chi-square distribution curve is known. When the given area is in the
left tail, the first step is to find the area in the right tail of the chi-square
distribution curve as follows.

Area in the right tail = 1 – Area in the left tail
                       = 1 – 0.05 = 0.95

Next, we locate 12 in the column for df and 0.95 in the top row of the table.
The entry at their intersection, 5.226, is the required value.
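Figure: Two chi-square distribution curves. Left panel: df = 7, with a shaded area of 0.10 in the right tail marking the value of $\chi^2$ for df = 7. Right panel: df = 12, with a shaded area of 0.05 in the left tail marking the value of $\chi^2$ for df = 12.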
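The table lookups in Examples 1 and 2 can be cross-checked numerically. The following is a minimal Python sketch and is not part of the original notes: the function names `chi2_cdf` and `chi2_right_tail_value` are illustrative, and the method (a series for the chi-square cumulative probability combined with bisection) is an assumed stand-in for the printed table.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square <= x) for df degrees of freedom (series expansion)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_right_tail_value(area, df):
    """Chi-square value that leaves `area` in the right tail (bisection)."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > area:   # widen until the tail is small enough
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Example 1: df = 7, area of 0.10 in the right tail.
print(round(chi2_right_tail_value(0.10, 7), 3))
# Example 2: df = 12, area of 0.05 in the left tail -> 0.95 in the right tail.
print(round(chi2_right_tail_value(1 - 0.05, 12), 3))
```

As in Example 2, a left-tail area is first converted to a right-tail area before the lookup.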
EXERCISES:

1. Describe the chi-square distribution. What is the parameter of such a
distribution?

2. Find the value of $\chi^2$ for 28 degrees of freedom and an area of 0.05 in the
right tail of the chi-square distribution curve.

3. Determine the value of $\chi^2$ for 23 degrees of freedom and an area of 0.990
in the left tail of the chi-square distribution curve.


HYPOTHESIS TEST ABOUT THE
POPULATION VARIANCE
(SINGLE VARIANCE)
• The chi-square distribution is also used to test a claim about a single
variance or standard deviation.

• To find the area under the chi-square distribution, use the $\chi^2$ table. After
the df reach 30, the table gives values only for multiples of 10. When the
exact degrees of freedom one is seeking are not specified in the table, the
closest smaller value should be used.

Formula for the Chi-square Test for a Single Variance

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

with degrees of freedom equal to n – 1, where

n = sample size
$s^2$ = sample variance
$\sigma^2$ = population variance
Assumptions for the Chi-square Test for a Single Variance

1. The sample must be randomly selected from the population.

2. The population must be normally distributed for the variable under study.

3. The observations must be independent of one another.

EXAMPLE:
An instructor wishes to see whether the variation in scores of the 23 students in her class is
less than the variance of the population. The variance of the class is 198. Is there enough
evidence to support the claim that the variation of the students is less than the population
variance ($\sigma^2 = 225$) at $\alpha = 0.05$? Assume that the scores are normally distributed.

SOLUTION:

STEP 1: State the hypotheses and identify the claim.

$H_0: \sigma^2 = 225$ and $H_1: \sigma^2 < 225$ (claim)

STEP 2: Find the critical value. Since this test is left-tailed and $\alpha = 0.05$,
use the value 1 – 0.05 = 0.95. The df are n – 1 = 23 – 1 = 22. Hence,
the critical value is 12.338. Note that the critical region is on the
left.

STEP 3: Compute the test value.

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(23-1)(198)}{225} = 19.36$$

STEP 4: Make the decision. Since the test value 19.36 falls in the
noncritical region (19.36 > 12.338), the decision is not to reject the
null hypothesis.

STEP 5: Summarize the results. There is not enough evidence to
support the claim that the variation in the test scores of the
instructor's students is less than the variation in the scores of the
population.
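The five steps of this left-tailed example can be traced in a few lines of Python. This is only a sketch: the function name `chi_square_statistic` is illustrative, and the critical value 12.338 is taken from the table as quoted in Step 2 rather than computed.

```python
def chi_square_statistic(n, sample_variance, population_variance):
    """Test value (n - 1) * s^2 / sigma^2 for a single-variance test."""
    return (n - 1) * sample_variance / population_variance

n = 23                   # class size
s2 = 198                 # sample variance of the class
sigma2 = 225             # hypothesized population variance
critical_value = 12.338  # table value quoted in Step 2 (df = 22, right-tail area 0.95)

test_value = chi_square_statistic(n, s2, sigma2)

# Left-tailed test: reject H0 only if the test value falls below the critical value.
reject_h0 = test_value < critical_value

print(round(test_value, 2), "reject H0" if reject_h0 else "do not reject H0")
```

For a right-tailed test the comparison would flip to `test_value > critical_value`, with the critical value read from the column for the right-tail area $\alpha$.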
EXAMPLE:

The variance of scores on the standardized mathematics test for all high
school seniors was 150 in 1999. A sample of scores for 20 high school seniors
who took this test this year gave a variance of 170. Test at the 5% significance
level whether the variance of current scores of all high school seniors on this
test is different from 150. Assume that the scores of all high school seniors on
this test are (approximately) normally distributed.

SOLUTION:
From the given information,

$n = 20, \quad \alpha = 0.05, \quad s^2 = 170$

The population variance was 150 in 1999.


STEP 1: State the null and alternative hypotheses.

$H_0: \sigma^2 = 150$ (The population variance is not different from 150)

$H_1: \sigma^2 \neq 150$ (The population variance is different from 150)

STEP 2: Select the distribution to use.

We use the chi-square distribution to test a hypothesis about $\sigma^2$.

STEP 3: Determine the rejection and nonrejection regions.

The significance level is 5%. The $\neq$ sign in $H_1$ indicates that the test is
two-tailed. The rejection region lies in both tails of the chi-square distribution
curve, with its total area equal to 0.05. Consequently, the area in each tail of
the distribution curve is 0.025. The values of $\alpha/2$ and $1 - \alpha/2$ are

$$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025 \quad\text{and}\quad 1 - \frac{\alpha}{2} = 1 - 0.025 = 0.975$$

• The degrees of freedom are
df = n – 1 = 20 – 1 = 19
• From the statistical table, the critical values of $\chi^2$ for 19 degrees of freedom
and for $\alpha/2$ and $1 - \alpha/2$ areas in the right tail are

$\chi^2$ for 19 df and 0.025 area in the right tail = 32.852
$\chi^2$ for 19 df and 0.975 area in the right tail = 8.907

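• These two values are shown below:

Figure: Chi-square distribution curve with the two critical values of $\chi^2$. Reject $H_0$ to the left of 8.907 (area $\alpha/2 = 0.025$), do not reject $H_0$ between 8.907 and 32.852, and reject $H_0$ to the right of 32.852 (area $\alpha/2 = 0.025$).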
STEP 4: Calculate the value of the test statistic.
The value of the test statistic $\chi^2$ for the sample variance is calculated
as follows:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(20-1)(170)}{150} = 21.533$$

STEP 5: Make a decision.

The value of the test statistic, $\chi^2 = 21.533$, is between the two critical
values of $\chi^2$, 8.907 and 32.852, and it falls in the nonrejection
region. Consequently, we fail to reject $H_0$ and conclude that the
population variance of the current scores of high school seniors on
this standardized mathematics test does not appear to be different
from 150.
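The two-tailed decision rule used in this example can be sketched in Python as follows. The function name `two_tailed_variance_test` is illustrative, and the two critical values are the table values for 19 df (8.907 and 32.852), supplied as inputs rather than computed.

```python
def two_tailed_variance_test(n, sample_variance, hypothesized_variance,
                             lower_critical, upper_critical):
    """Return the chi-square test value and whether H0 is rejected."""
    test_value = (n - 1) * sample_variance / hypothesized_variance
    # Two-tailed test: reject H0 if the test value falls in either tail.
    reject_h0 = test_value < lower_critical or test_value > upper_critical
    return test_value, reject_h0

test_value, reject_h0 = two_tailed_variance_test(
    n=20, sample_variance=170, hypothesized_variance=150,
    lower_critical=8.907, upper_critical=32.852,
)
print(round(test_value, 3), "reject H0" if reject_h0 else "fail to reject H0")
```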

Note: We can test a hypothesis about the population standard deviation $\sigma$
using the same procedure as that for the population variance. The only
change is that $H_0$ and $H_1$ are stated in terms of $\sigma$ rather than $\sigma^2$.
p-value:

• Approximate p-values for the chi-square test can be found in the $\chi^2$ table.
The procedure is somewhat more complicated than the previous procedures
for finding p-values for the z and t tests, since the chi-square distribution is
not symmetric and $\chi^2$ values cannot be negative.

• As we did for the t test, we will determine an interval for the p-value
based on the table.

EXAMPLE:
Find the p-value when $\chi^2 = 19.274$, df = 7, and the test is right-tailed.

SOLUTION:
Referring to the table, look across the row with df = 7 and find the two
values that 19.274 falls between. They are 18.475 and 20.278. Look up to the
top row and find the $\alpha$ values corresponding to these values. They are
0.01 and 0.005, respectively. Hence the p-value is contained in the interval

0.005 < p-value < 0.01

NOTE: When the $\chi^2$ test is two-tailed, both interval values must be doubled.
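The bracketing procedure above, including the doubling rule for two-tailed tests, can be sketched in Python. This is an illustration only: the function name `p_value_interval` is hypothetical, and the df = 7 row below uses standard chi-square table entries, of which only 18.475 and 20.278 are quoted in these notes.

```python
# One row of the chi-square table (df = 7): right-tail area -> table value.
CHI2_ROW_DF7 = {0.10: 12.017, 0.05: 14.067, 0.025: 16.013,
                0.01: 18.475, 0.005: 20.278}

def p_value_interval(test_value, table_row, two_tailed=False):
    """Bracket the p-value of a right-tailed test between two table areas."""
    low, high = 0.0, 1.0
    # Walk from the largest tail area (smallest table value) to the smallest.
    for area in sorted(table_row, reverse=True):
        if test_value >= table_row[area]:
            high = area      # test value lies beyond this entry: p <= area
        else:
            low = area       # test value lies before this entry: p > area
            break
    if two_tailed:
        # Both interval values are doubled for a two-tailed test.
        low, high = min(1.0, 2 * low), min(1.0, 2 * high)
    return low, high

print(p_value_interval(19.274, CHI2_ROW_DF7))                   # right-tailed
print(p_value_interval(19.274, CHI2_ROW_DF7, two_tailed=True))  # two-tailed
```

A lower bound of 0.0 or an upper bound of 1.0 means the test value lies outside the range covered by the table row.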
EXERCISES:

1. The 2-inch-long bolts manufactured by a company must have a variance of 0.003 square inch or less for
acceptance by a buyer. A random sample of 26 such bolts gave a variance of 0.0061 square inch.

a) Test at the 1% significance level whether the variance of all such bolts is greater than 0.003 square inch.
Assume that the lengths of all 2-inch-long bolts manufactured by this company are (approximately)
normally distributed.

b) Make the 98% confidence intervals for the population variance and standard deviation.
HYPOTHESIS TEST ABOUT THE
POPULATION VARIANCE:
(TWO VARIANCES)
• For the comparison of two variances or standard deviations, an F test is
used. The F test should not be confused with the chi-square test, which
compares a single sample variance to a specific population variance.

• If two independent samples are selected from two normally distributed
populations in which the variances are equal ($\sigma_1^2 = \sigma_2^2$), and if the
sample variances $s_1^2$ and $s_2^2$ are compared as the ratio $s_1^2 / s_2^2$, the
sampling distribution of this ratio is called the F distribution.

Characteristics of the F distribution

1. The values of F cannot be negative, because variances are always positive
or 0.

2. The distribution is positively skewed.

3. The mean value of F is approximately equal to 1.

4. The F distribution is a family of curves based on the df of the variance of
the numerator and the df of the variance of the denominator.
Formula for the F test

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2$ is the larger of the two variances.
The F test has two terms for the degrees of freedom: that of the numerator, $n_1 - 1$,
and that of the denominator, $n_2 - 1$, where $n_1$ is the sample size from
which the larger variance was obtained.

The F critical values for $\alpha$ = 0.005, 0.01, 0.025, 0.05, and 0.10 can be found in
the F table (each $\alpha$ value involves a separate table).
EXAMPLE:
Find the critical value for a two-tailed F test with $\alpha = 0.05$ when the sample size
from which the variance for the numerator was obtained was 21 and the sample
size from which the variance for the denominator was obtained was 12.

SOLUTION:
Since this is a two-tailed test with $\alpha = 0.05$, the 0.05/2 = 0.025 table must be
used. Here, the degrees of freedom for the numerator (d.f.N.) = 21 – 1 = 20 and
d.f.D. = 12 – 1 = 11; hence, the critical value is 3.23.

Note: As noted previously, when the F test is used, the larger variance is always
placed in the numerator of the formula. When one is conducting a two-tailed
test, $\alpha$ is split into two equal parts; even though there are two critical
values, only the right tail is used. The reason is that, with the larger variance
in the numerator, the F test value is always greater than or equal to 1.

Assumptions for Testing the Difference Between Two Variances

1. The populations from which the samples were obtained must be normally
distributed. (Note: The test should not be used when the distributions
depart from normality.)

2. The samples must be independent of each other.


EXAMPLE:
A medical researcher wishes to see whether the variance of the heart rates
(in beats per minute) of smokers is different from the variance of heart
rates of people who do not smoke. Two samples are selected, and the data
are as shown. Using $\alpha = 0.05$, is there enough evidence to support the
claim?

Smokers            Nonsmokers
$n_1 = 26$         $n_2 = 18$
$s_1^2 = 36$       $s_2^2 = 10$

SOLUTION:

STEP 1: State the hypotheses and identify the claim.

$H_0: \sigma_1^2 = \sigma_2^2$ and $H_1: \sigma_1^2 \neq \sigma_2^2$ (claim)

STEP 2: Find the critical value. Use the 0.025 table since $\alpha = 0.05$ and
this is a two-tailed test. Here d.f.N. = 26 – 1 = 25 and d.f.D. = 18 – 1 = 17. The
critical value is 2.56 (d.f.N. = 24 was used, the closest smaller value in the table).

STEP 3: Compute the test value.

$$F = \frac{s_1^2}{s_2^2} = \frac{36}{10} = 3.6$$

STEP 4: Make the decision. Reject the null hypothesis, since 3.6 > 2.56.

STEP 5: Summarize the results. There is enough evidence to support the
claim that the variances of the heart rates of smokers and nonsmokers are
different.
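The F test steps for this example can be sketched in Python. The function name `f_test_statistic` is illustrative; it places the larger variance in the numerator, as the formula requires, and the critical value 2.56 is the table value quoted in Step 2 rather than a computed one.

```python
def f_test_statistic(var_a, n_a, var_b, n_b):
    """Return (F, d.f.N., d.f.D.) with the larger variance in the numerator."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Smokers: n = 26, variance = 36. Nonsmokers: n = 18, variance = 10.
f_value, dfn, dfd = f_test_statistic(36, 26, 10, 18)

critical_value = 2.56   # 0.025 table, d.f.N. = 24 (closest smaller), d.f.D. = 17

# Only the right tail is used, because F >= 1 by construction.
reject_h0 = f_value > critical_value

print(f_value, dfn, dfd, "reject H0" if reject_h0 else "do not reject H0")
```

Because the function orders the variances itself, the two samples can be passed in either order.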
p-value:

• Finding p-values for the F test statistic is somewhat more complicated,
since it requires looking through all the F tables using the specific d.f.N.
and d.f.D. values.

• For example, suppose that a certain test has F = 3.58, d.f.N. = 5, and
d.f.D. = 10. To find the p-value interval for F = 3.58, one must first find the
corresponding F values for d.f.N. = 5 and d.f.D. = 10 for $\alpha$ equal to 0.005,
0.01, 0.025, 0.05, and 0.10. Then make a table as shown.

$\alpha$    0.10    0.05    0.025   0.01    0.005
F           2.52    3.33    4.24    5.64    6.87

• Now locate the two F values that the test value 3.58 falls between. In this
case, 3.58 falls between 3.33 and 4.24. Hence the p-value for a right-tailed
test for F = 3.58 falls between 0.025 and 0.05.
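As a rough numerical check on the interval read from the tables, the right-tail area of the F distribution can be approximated directly. The sketch below is an assumption-laden illustration and not part of the original notes: the function name `f_right_tail` is hypothetical, it relies on the fact that d.f.N.·F / (d.f.N.·F + d.f.D.) follows a beta distribution with parameters d.f.N./2 and d.f.D./2, it integrates that density with Simpson's rule, and it assumes F > 0 and d.f.D. > 2.

```python
import math

def f_right_tail(f_value, dfn, dfd, steps=2000):
    """Approximate P(F > f_value); assumes f_value > 0 and dfd > 2."""
    a, b = dfn / 2.0, dfd / 2.0
    # Transform the F value to the equivalent point of a Beta(a, b) variable.
    t0 = dfn * f_value / (dfn * f_value + dfd)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def density(t):
        if t >= 1.0:
            return 0.0   # beta density vanishes at 1 when b > 1
        return math.exp(log_norm + (a - 1) * math.log(t)
                        + (b - 1) * math.log(1.0 - t))

    # Simpson's rule on [t0, 1] with an even number of steps.
    h = (1.0 - t0) / steps
    total = density(t0) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(t0 + i * h)
    return total * h / 3.0

p = f_right_tail(3.58, 5, 10)
print(0.025 < p < 0.05)   # consistent with the interval found from the tables
```

For a two-tailed test, the right-tail area would be doubled, just as the table interval is.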
EXERCISES:

1. When one is computing the F test value, what condition is placed on the
variance that is in the numerator?

2. What are the characteristics of the F distribution?

3. By using the table, find the critical value for each.

a) Sample 1: $s_1^2 = 128$, $n_1 = 23$
   Sample 2: $s_2^2 = 162$, $n_2 = 16$
   Two-tailed, $\alpha = 0.01$

b) Sample 1: $s_1^2 = 37$, $n_1 = 14$
   Sample 2: $s_2^2 = 89$, $n_2 = 25$
   Two-tailed, $\alpha = 0.01$

4. Find the p-value interval for each F test value.

a) F = 2.97, d.f.N. = 9, d.f.D. = 14, right-tailed
b) F = 3.32, d.f.N. = 6, d.f.D. = 12, two-tailed
TASK 8
1. The manufacturer of a certain brand of light bulbs claims that the variance
of the lives of these bulbs is 4200 square hours. A consumer agency
took a random sample of 25 such bulbs and tested them. The variance of
the lives of these bulbs was found to be 5200 square hours. Assume that
the lives of all such bulbs are (approximately) normally distributed.

a) Make the 99% confidence intervals for the variance and standard
deviation of the lives of all such bulbs.

b) Test at the 5% significance level whether the variance of such bulbs is
different from 4200 square hours.
TASK 8 (continued)
2. The weights in ounces of a sample of running shoes for men and women
are shown below. Calculate the variance for each sample and test the
claim that the variances are equal at $\alpha = 0.05$. Use the p-value method.

Men                     Women
11.9   10.4   12.6      10.6   10.2    8.8
12.3   11.1   14.7       9.6    9.5    9.5
 9.2   10.8   12.9      10.1   11.2    9.3
11.2   11.7   13.3       9.4   10.3    9.5
13.8   12.8   14.5       9.8   10.3   11.0
