Chi-Square Tests
• Like the t distribution, the chi-square distribution has only one parameter,
called the degrees of freedom (df). The shape of a specific chi-square
distribution depends on the number of degrees of freedom.
• The degrees of freedom for a chi-square distribution are calculated by using
different formulas for different tests.
• The random variable χ² assumes nonnegative values only. Hence, a chi-square
distribution curve starts at the origin (zero point) and lies entirely to the right of
the vertical axis.
• For small df the curve is skewed to the right. Eventually, for large df, the
chi-square distribution curve looks like a normal distribution curve (it becomes
nearly symmetric).
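• The peak (or mode) of a chi-square distribution curve with 1 or 2 degrees of
freedom occurs at zero; for a curve with 3 or more degrees of freedom it occurs
at df − 2.
• Like all other continuous distribution curves, the total area under a chi-square
distribution curve is 1.0.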
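The shape properties listed above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the names `chi2_pdf` and `describe_curve` are illustrative, and the grid limits (`upper`, `steps`) are assumptions chosen so that the area beyond them is negligible for small df.

```python
import math


def chi2_pdf(x: float, df: int) -> float:
    """Chi-square density at x > 0 for the given degrees of freedom."""
    half = df / 2.0
    log_density = ((half - 1.0) * math.log(x) - x / 2.0
                   - half * math.log(2.0) - math.lgamma(half))
    return math.exp(log_density)


def describe_curve(df: int, upper: float = 100.0, steps: int = 100_000):
    """Return (approximate peak location, approximate total area).

    Uses a midpoint grid on (0, upper). The defaults are illustrative
    choices that are adequate for small df only.
    """
    h = upper / steps
    xs = [h * (i + 0.5) for i in range(steps)]
    ys = [chi2_pdf(x, df) for x in xs]
    peak = xs[max(range(steps), key=ys.__getitem__)]
    area = h * sum(ys)  # midpoint-rule estimate of the area under the curve
    return peak, area


for df in (3, 5, 10):
    peak, area = describe_curve(df)
    print(f"df={df}: peak near {peak:.2f}, area near {area:.4f}")
```

For each df tried, the peak should land close to df − 2 and the estimated area close to 1.0, up to the resolution of the grid.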
If we know the df and the area in the right tail of a chi-square distribution
curve, we can find the value of χ² from the chi-square distribution table. The
following examples show how.
SOLUTION:
To find the required value of χ², we locate 7 in the column for df and 0.100
in the top row of the table. The required value is the entry at the intersection
of that row and column, χ² = 12.017.
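EXAMPLE 2:
Find the value of χ² for 12 degrees of freedom and an area of 0.05 in the left
tail of the chi-square distribution curve.
SOLUTION:
We can read the chi-square distribution table only when an area in the right tail of
the chi-square distribution curve is known. When the given area is in the
left tail, the first step is to find the area in the right tail of the chi-square
distribution curve as follows.
Area in the right tail = 1 − area in the left tail = 1 − 0.05 = 0.95
Next, we locate 12 in the column for df and 0.95 in the top row of the table.
The entry at their intersection is χ² = 5.226.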
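The two lookups above can be sketched in Python. This is an illustration only: `CHI2_TABLE` holds just the two entries of a standard chi-square table needed here, and the function name `chi2_value` and its `tail` parameter are assumptions made for the example.

```python
# Tiny excerpt of a standard chi-square table:
# (df, area in the right tail) -> chi-square value.
CHI2_TABLE = {
    (7, 0.100): 12.017,
    (12, 0.950): 5.226,
}


def chi2_value(df: int, area: float, tail: str = "right") -> float:
    """Look up chi-square for a given df and tail area.

    The table is indexed by right-tail area, so a left-tail area
    is first converted with: right area = 1 - left area.
    """
    if tail not in ("right", "left"):
        raise ValueError("tail must be 'right' or 'left'")
    right_area = area if tail == "right" else round(1.0 - area, 3)
    return CHI2_TABLE[(df, right_area)]


print(chi2_value(7, 0.10))               # df = 7, area 0.10 in the right tail
print(chi2_value(12, 0.05, tail="left"))  # df = 12, area 0.05 in the left tail
```

A full table would simply have more entries; the left-to-right conversion step stays the same.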
[Figure: two chi-square distribution curves. Left: df = 7, showing the value of χ² for an area of 0.10 in the right tail. Right: df = 12, showing the value of χ² for a shaded area of 0.05 in the left tail.]
EXERCISES:
2. Find the value of χ² for 28 degrees of freedom and an area of 0.05 in the
• To find the area under the chi-square distribution, use the χ² table. After
the df reach 30, the table gives values only for multiples of 10. When the
exact degrees of freedom one is seeking are not specified in the table, the
closest smaller value should be used.
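The "closest smaller value" rule can be written as a small helper. This is a sketch under one assumption that the notes do not state: that the table's last row is df = 100. The function name `table_df` is illustrative.

```python
def table_df(df: int, max_df: int = 100) -> int:
    """Return the df row to use in a chi-square table.

    Rows are assumed to be 1..30 and then multiples of 10 up to
    `max_df` (the upper limit is an assumption about the table).
    """
    if df < 1:
        raise ValueError("df must be at least 1")
    if df <= 30:
        return df  # exact row is available
    # Round down to the closest smaller multiple of 10.
    return min(max_df, (df // 10) * 10)


for df in (28, 36, 45, 50):
    print(df, "->", table_df(df))
```

Rounding down rather than to the nearest row is the conservative choice for a right-tailed test, because the smaller df gives the smaller critical value in the table's right-tail columns.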
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
2. The population must be normally distributed for the variable under study.
EXAMPLE:
An instructor wishes to see whether the variation in scores of the 23 students in her class is
less than the variance of the population. The variance of the class is 198. Is there enough
evidence to support the claim that the variation of the students is less than the population
variance (σ² = 225) at α = 0.05? Assume that the scores are normally distributed.
SOLUTION:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(23-1)(198)}{225} = 19.36$$
STEP 4: Make the decision. Since the test value 19.36 falls in the
noncritical region, the decision is not to reject the null hypothesis.
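The arithmetic and the decision for this left-tailed test can be sketched in Python. The critical value 12.338 is not given in the notes above; it is taken here from a standard chi-square table (df = 22, area 0.95 to the right) and should be treated as an assumption of the example. The function name `chi2_statistic` is illustrative.

```python
def chi2_statistic(n: int, sample_var: float, pop_var: float) -> float:
    """Test value for a single variance: (n - 1) * s^2 / sigma^2."""
    return (n - 1) * sample_var / pop_var


n, s2, sigma2 = 23, 198, 225
test_value = chi2_statistic(n, s2, sigma2)

# Left-tailed test at alpha = 0.05 with df = n - 1 = 22.
# Assumed table value: area 0.95 to the right, df = 22.
critical_value = 12.338
reject_h0 = test_value < critical_value  # critical region is the left tail

print(f"test value = {test_value:.2f}, reject H0: {reject_h0}")
```

Because the test value is larger than the left-tail critical value, it lies in the noncritical region, which agrees with the decision in STEP 4.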
EXAMPLE:
The variance of scores on the standardized mathematics test for all high
school seniors was 150 in 1999. A sample of scores for 20 high school seniors
who took this test this year gave a variance of 170. Test at the 5% significance
level if the variance of current scores of all high school seniors on this test is
different from 150. Assume that the scores of all high school seniors on this
test are (approximately) normally distributed.
SOLUTION:
From the given information,
α = 0.05, so α/2 = 0.05/2 = 0.025 and 1 − α/2 = 1 − 0.025 = 0.975
• The degrees of freedom are
df = n – 1 = 20 – 1 = 19
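• From the statistical table, the critical values of χ² for 19 degrees of freedom
are 8.907 (area of 0.975 in the right tail, that is, 0.025 in the left tail) and
32.852 (area of 0.025 in the right tail).
[Figure: chi-square distribution curve for df = 19 showing the two critical values of χ², 8.907 and 32.852, with an area of 0.025 in each tail.]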
STEP 4: Calculate the value of the test statistic.
The value of the test statistic χ² for the sample variance is calculated
as follows:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(20-1)(170)}{150} = 21.533$$
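For this two-tailed example, the test value and its position relative to the two critical values can be sketched in Python. The numbers come from the example (n = 20, s² = 170, σ² = 150, critical values 8.907 and 32.852); the function name `two_tailed_decision` is illustrative, and the decision it prints is derived here rather than quoted from the notes.

```python
def two_tailed_decision(test_value: float, lower: float, upper: float) -> str:
    """Reject H0 only if the test value falls in either tail."""
    if test_value < lower or test_value > upper:
        return "reject H0"
    return "do not reject H0"


n, s2, sigma2 = 20, 170, 150
chi2_value_obs = (n - 1) * s2 / sigma2  # (n - 1) * s^2 / sigma^2

decision = two_tailed_decision(chi2_value_obs, lower=8.907, upper=32.852)
print(f"chi-square = {chi2_value_obs:.3f}: {decision}")
```

The test value lies between the two critical values, so under these numbers the null hypothesis would not be rejected.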
Note: We can make a test of hypothesis about the population standard deviation using
the same procedure as that for the population variance. To make a test of
hypothesis about σ, the only change is to state the values of σ in
H₀ and H₁.
p-value:
• Approximate p-values for the chi-square test can be found in the χ² table.
• As we did for the t-test, we will determine an interval for the p-value
based on the table.
EXAMPLE:
Find the p-value when χ² = 19.274 and the test is right-tailed.
SOLUTION:
Referring to the table, to get the p-value, look across the row with df = 7 and find
the two values that 19.274 falls between. They are 18.475 and 20.278. Look
up to the top row and find the α values corresponding to these values. They
are 0.01 and 0.005, respectively. Hence the p-value is contained in the
interval
0.005 < p-value < 0.01
NOTE: When the test is two-tailed, both interval values must be doubled.
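The table-bracketing procedure, including the doubling rule for two-tailed tests, can be sketched in Python. The row `ROW_DF7` reproduces the df = 7 row of a standard chi-square table for five common right-tail areas; the function name `p_value_interval` and its `two_tailed` flag are illustrative.

```python
# df = 7 row of a standard chi-square table: (right-tail area, critical value),
# ordered by increasing critical value.
ROW_DF7 = [
    (0.10, 12.017),
    (0.05, 14.067),
    (0.025, 16.013),
    (0.01, 18.475),
    (0.005, 20.278),
]


def p_value_interval(test_value, row, two_tailed=False):
    """Return (low, high) bounds on the p-value from one table row."""
    factor = 2 if two_tailed else 1
    if test_value < row[0][1]:
        # Smaller than every tabulated value: p exceeds the largest area.
        return (min(1.0, factor * row[0][0]), 1.0)
    for (area_hi, crit_lo), (area_lo, crit_hi) in zip(row, row[1:]):
        if crit_lo <= test_value < crit_hi:
            return (factor * area_lo, factor * area_hi)
    # Larger than every tabulated value: p is below the smallest area.
    return (0.0, factor * row[-1][0])


print(p_value_interval(19.274, ROW_DF7))
print(p_value_interval(19.274, ROW_DF7, two_tailed=True))
```

The first call brackets the right-tailed p-value of the example; the second shows the same bounds doubled, as the note requires for a two-tailed test.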
EXERCISES:
1. The 2-inch-long bolts manufactured by a company must have a variance of 0.003 square inch or less for
acceptance by a buyer. A random sample of 26 such bolts gave a variance of 0.0061 square inch.
a) Test at the 1% significance level whether the variance of all such bolts is greater than 0.003 square inch.
Assume that the lengths of all 2-inch-long bolts manufactured by this company are (approximately)
normally distributed.
b) Make the 98% confidence intervals for the population variance and standard deviation.
HYPOTHESIS TEST ABOUT THE POPULATION VARIANCE (TWO VARIANCES)
• For the comparison of two variances or standard deviations, an F test is
used. The F test should not be confused with the chi-square test, which
compares a single sample variance to a specific population variance.
• If two independent samples are selected from two normally distributed
populations in which the variances are equal (σ₁² = σ₂²), and if the sample
variances s₁² and s₂² are compared as the ratio s₁²/s₂², the sampling
distribution of this ratio is called the F distribution.
SOLUTION:
Since this is a two-tailed test with α = 0.05, the 0.05/2 = 0.025 table must be
used. Here, the degrees of freedom for the numerator are d.f.N. = 21 − 1 = 20 and
for the denominator d.f.D. = 12 − 1 = 11; hence, the critical value is 3.23.
1. As noted previously, when the F test is used, the larger variance is always
placed in the numerator of the formula. When one is conducting a two-tailed
test, α is split; and even though there are two values, only the
right tail is used. The reason is that the F test value is always greater than
or equal to 1.
1. The populations from which the samples were obtained must be normally
distributed. (Note: The test should not be used when the distributions
depart from normality.)
Smokers        Nonsmokers
n₁ = 26        n₂ = 18
s₁² = 36       s₂² = 10
SOLUTION:
H₀: σ₁² = σ₂² and H₁: σ₁² ≠ σ₂² (claim)
STEP 2: Find the critical value. Use the 0.025 table since α = 0.05 and
this is a two-tailed test. Here d.f.N. = 26 − 1 = 25 and d.f.D. = 18 − 1 = 17. The
critical value is 2.56 (d.f.N. = 24 was used, the closest smaller value in the table).
$$F = \frac{s_1^2}{s_2^2} = \frac{36}{10} = 3.6$$
STEP 4: Make the decision. Reject the null hypothesis, since 3.6>2.56.
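The smokers and nonsmokers example can be sketched in Python as follows. The helper `f_test_value` is an illustrative name; it places the larger sample variance in the numerator, as the notes require, and returns the matching degrees of freedom. The critical value 2.56 is the table value quoted in STEP 2.

```python
def f_test_value(var_a: float, n_a: int, var_b: float, n_b: int):
    """Return (F, d.f.N., d.f.D.) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Smokers: n1 = 26, s1^2 = 36; nonsmokers: n2 = 18, s2^2 = 10.
f_value, dfn, dfd = f_test_value(36, 26, 10, 18)

critical_value = 2.56  # from the 0.025 table, as quoted in STEP 2
reject_h0 = f_value > critical_value  # only the right tail is used

print(f"F = {f_value}, d.f.N. = {dfn}, d.f.D. = {dfd}, reject H0: {reject_h0}")
```

Swapping the order of the two samples in the call gives the same result, because the helper always puts the larger variance in the numerator, so F is never below 1.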
• For example, suppose that a certain test has F = 3.58, d.f.N. = 5, and
d.f.D. = 10. To find the p-value interval for F = 3.58, one must first find the
corresponding F values for d.f.N. = 5 and d.f.D. = 10 for α equal to 0.005,
0.01, 0.025, 0.05, and 0.10. Then make a table as shown.
• Now locate the two F values that the test value 3.58 falls between. In this
case, 3.58 falls between 3.33 and 4.24. Hence the p-value for a right-
tailed test for F=3.58 falls between 0.025 and 0.05.
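The same bracketing idea can be sketched for the F test. The values 3.33 and 4.24 are the ones quoted above; the other three entries in `F_VALUES` are taken from standard F tables for d.f.N. = 5 and d.f.D. = 10 and are an assumption of this example, as is the function name `f_p_value_interval`.

```python
from bisect import bisect_right

# Right-tail areas and the matching F values for d.f.N. = 5, d.f.D. = 10.
ALPHAS = [0.10, 0.05, 0.025, 0.01, 0.005]
F_VALUES = [2.52, 3.33, 4.24, 5.64, 6.87]  # increasing as alpha decreases


def f_p_value_interval(f_value: float):
    """Return (low, high) bounds on the right-tailed p-value."""
    i = bisect_right(F_VALUES, f_value)  # number of table values <= f_value
    high = ALPHAS[i - 1] if i > 0 else 1.0
    low = ALPHAS[i] if i < len(ALPHAS) else 0.0
    return low, high


print(f_p_value_interval(3.58))
```

For a two-tailed F test, both bounds would be doubled, in the same way as for the chi-square test.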
EXERCISES:
1. When one is computing the F test value, what condition is placed on the
variance that is in the numerator?
a) Make the 99% confidence intervals for the variance and standard
deviation of the lives of all such bulbs.
Men                      Women
11.9   10.4   12.6       10.6   10.2    8.8
12.3   11.1   14.7        9.6    9.5    9.5
 9.2   10.8   12.9       10.1   11.2    9.3
11.2   11.7   13.3        9.4   10.3    9.5
13.8   12.8   14.5        9.8   10.3   11.0