
Chi-Square & F Distributions

Carolyn J. Anderson
EdPsych 580
Fall 2005

Chi-Square & F Distributions – p. 1/55


Chi-Square & F Distributions

. . . and Inferences about Variances
• The Chi-square Distribution
• Definition, properties, tables, density calculator
• Testing hypotheses about the variance of a single population (i.e., Ho: σ² = K). Example.
• The F Distribution
• Definition, important properties, tables
• Testing the equality of variances of two independent populations (i.e., Ho: σ²_1 = σ²_2). Example.
Chi-Square & F Distributions

. . . and Inferences about Variances
• Comments regarding testing the homogeneity of variance assumption of the two independent groups t–test (and ANOVA).
• Relationship among the Normal, t, χ², and F distributions.


Chi-Square & F Distributions
• Motivation. The normal and t distributions are
useful for tests of population means, but often
we may want to make inferences about
population variances.
• Examples:
• Does the variance equal a particular value?
• Does the variance in one population equal the variance in another population?
• Are individual differences greater in one population than in another population?
• Are the variances in J populations all the same?
• Is the assumption of homogeneous variances met?
Chi-Square & F Distributions
• To make statistical inferences about population variance(s), we need
• χ² −→ The Chi-square distribution (Greek “chi”).
• F−→ Named after Sir Ronald Fisher who
developed the main applications of F.
• The χ² and F distributions are used for many problems in addition to the ones listed above.
• They provide good approximations to a large class of sampling distributions that are not easily determined.
The Big Five Theoretical Distributions
• The Big Five are the Normal, Student’s t, χ², F, and the Binomial(π, n).
• Plan:
• Introduce the χ² and then the F distributions.
• Illustrate their uses for testing variances.
• Summarize and describe the relationship among the Normal, Student’s t, χ², and F.


The Chi-Square Distributions
• Suppose we have a population with scores Y that are normally distributed with mean E(Y) = µ and variance var(Y) = σ² (i.e., Y ∼ N(µ, σ²)).
• If we repeatedly take samples of size n = 1 and for each “sample” compute

z² = (Y − µ)²/σ² = squared standard score

• Define χ²_1 = z².
• What would the sampling distribution of χ²_1 look like?
The Chi-Square Distribution, ν = 1

[Figure: density curve of the χ²_1 distribution]


The Chi-Square Distribution, ν = 1
• χ²_1 values are non-negative real numbers.
• Since 68% of values from N(0, 1) fall between −1 and 1, 68% of values from the χ²_1 distribution must fall between 0 and 1.
• The chi-square distribution with ν = 1 is very skewed.
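The 68% claim follows from squaring: χ²_1 = z² is at most 1 exactly when z falls between −1 and 1. A quick numerical check (a sketch in Python with scipy, offered as an alternative to the SAS probability functions used later in these notes):

```python
from scipy.stats import norm, chi2

# P(-1 <= Z <= 1) for Z ~ N(0, 1)
p_z = norm.cdf(1) - norm.cdf(-1)

# P(chi-square_1 <= 1): squaring maps (-1, 1) onto [0, 1)
p_chisq = chi2.cdf(1, df=1)

print(round(p_z, 4), round(p_chisq, 4))  # 0.6827 0.6827
```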


The Chi-Square Distribution, ν = 2
• Repeatedly draw independent (random) samples of n = 2 from N(µ, σ²).
• Compute Z²_1 = (Y1 − µ)²/σ² and Z²_2 = (Y2 − µ)²/σ².
• Compute the sum: χ²_2 = Z²_1 + Z²_2.


The Chi-Square Distribution, ν = 2
• All values are non-negative.
• A little less skewed than χ²_1.
• The probability that χ²_2 falls in the range 0 to 1 is smaller than that for χ²_1:

P(χ²_1 ≤ 1) = .68
P(χ²_2 ≤ 1) = .39

• Note that the mean ≈ ν = 2.
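Both tail probabilities and the mean can be verified numerically (a sketch with scipy; the variable names are illustrative):

```python
from scipy.stats import chi2

p1 = chi2.cdf(1, df=1)  # P(chi-square_1 <= 1)
p2 = chi2.cdf(1, df=2)  # P(chi-square_2 <= 1)

print(round(p1, 2), round(p2, 2))  # 0.68 0.39
print(chi2.mean(2))                # 2.0 (mean equals df)
```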


Chi-Square Distributions
• Generalize: For n independent observations
from a N (µ, σ 2 ), the sum of the squared
standard scores has a Chi-square distribution
with n degrees of freedom.
• The chi-square distribution depends only on its degrees of freedom, which in turn depend on the sample size n.
• The standard scores are computed using the population µ and σ²; however, we usually don’t know what µ and σ² equal. When µ and σ² are estimated from the sampled data, the degrees of freedom are less than n.
Chi-Square Dist: Varying ν

[Figure: χ² density curves for several values of ν]


Properties of Family of χ² Distributions
• They are all positively skewed.
• As ν gets larger, the degree of skew decreases.
• As ν gets very large, χ²_ν approaches the normal distribution.

Why? The Central Limit Theorem (for sums):

Consider a random sample of size n from a population distribution having mean µ and variance σ². If n is sufficiently large, then the distribution of Σ_{i=1}^n Yi is approximately normal with mean nµ and variance nσ².


Properties of Family of χ² Distributions
• E(χ²_ν) = mean = ν = degrees of freedom.
• E[(χ²_ν − E(χ²_ν))²] = var(χ²_ν) = 2ν.
• The mode of χ²_ν is at the value ν − 2 (for ν ≥ 2).
• The median is approximately (3ν − 2)/3 (for ν ≥ 2).
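These moment formulas are easy to check against scipy (a sketch; ν = 10 is an arbitrary choice):

```python
from scipy.stats import chi2

nu = 10
print(chi2.mean(nu), chi2.var(nu))  # 10.0 20.0 (mean = nu, var = 2*nu)

# The density peaks at the mode, nu - 2 = 8
assert chi2.pdf(8, nu) > chi2.pdf(7.5, nu)
assert chi2.pdf(8, nu) > chi2.pdf(8.5, nu)

# Median is close to the approximation (3*nu - 2)/3
print(round(chi2.median(nu), 2), round((3 * nu - 2) / 3, 2))  # 9.34 9.33
```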


Properties of Family of χ² Distributions
IF
• a random variable χ²_ν1 has a chi-square distribution with ν1 degrees of freedom, and
• a second, independent random variable χ²_ν2 has a chi-square distribution with ν2 degrees of freedom,
THEN their sum has a chi-square distribution with (ν1 + ν2) degrees of freedom:

χ²_{ν1+ν2} = χ²_ν1 + χ²_ν2
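The additivity property can be illustrated by simulation (a sketch with numpy; the df values 3 and 5 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
nu1, nu2 = 3, 5

# Sums of independent chi-square(3) and chi-square(5) draws
total = rng.chisquare(nu1, 100_000) + rng.chisquare(nu2, 100_000)

# The sums should behave like chi-square(8):
# mean = nu1 + nu2 = 8 and variance = 2(nu1 + nu2) = 16
print(round(total.mean(), 1), round(total.var(), 1))
```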
Percentiles of χ² Distributions
Note: .95χ²_1 = 3.84 = (1.96)², the 95th percentile of z²
• Tables
• http://calculator.stat.ucla.edu/cdf/
• pvalue.f program or the executable version, pvalue.exe, on the course web-site.
• SAS: PROBCHI(x, df <,nc>) where
• x = number
• df = degrees of freedom
• If p = PROBCHI(x, df), then p = Prob(χ²_df ≤ x)
SAS Examples & Computations
p-values:
DATA probval;
pz=PROBNORM(1.96);
pzsq=PROBCHI(3.84,1);
output;
RUN;
Output:
pz pzsq
0.97500 0.95000

What are these values?
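These are P(Z ≤ 1.96) and P(χ²_1 ≤ 1.96²). The same values can be reproduced outside SAS; a sketch in Python, where norm.cdf plays the role of PROBNORM and chi2.cdf the role of PROBCHI:

```python
from scipy.stats import norm, chi2

pz = norm.cdf(1.96)       # PROBNORM(1.96)
pzsq = chi2.cdf(3.84, 1)  # PROBCHI(3.84, 1)

print(round(pz, 4), round(pzsq, 4))  # 0.975 0.95
```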



SAS Examples & Computations

. . . To get density values . . .

* Probability density;
data chisq3;
do x=0 to 10 by .005;
pdfxsq=pdf('CHISQUARE',x,3);
output;
end;
run;
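A density curve computed the same way in Python (a sketch; chi2.pdf mirrors SAS’s pdf('CHISQUARE', x, 3)):

```python
import numpy as np
from scipy.stats import chi2

x = np.arange(0, 10.005, 0.005)
pdfxsq = chi2.pdf(x, df=3)  # chi-square(3) density at each x

# The chi-square(3) density peaks at its mode, nu - 2 = 1
print(round(chi2.pdf(1, df=3), 3))  # 0.242
```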


Inferences about a Population Variance
or the sampling distribution of the sample
variance from a normal population.
• Statistical Hypotheses:

Ho: σ² = σ²_o versus Ha: σ² ≠ σ²_o

• Assumptions: Observations are independently drawn (random) from a normal population; i.e.,

Yi ∼ N(µ, σ²) i.i.d.


Inferences about σ² (continued)
Test Statistic:
• We know that

Σ_{i=1}^n (Yi − µ)²/σ² = Σ_{i=1}^n z²_i ∼ χ²_n

if zi ∼ N(0, 1).
• We don’t know µ, so we use Ȳ as an estimate of µ:

Σ_{i=1}^n (Yi − Ȳ)²/σ² ∼ χ²_{n−1}

or, since Σ_{i=1}^n (Yi − Ȳ)² = (n − 1)s²,

(n − 1)s²/σ² ∼ χ²_{n−1}


Test Statistic for Ho: σ² = σ²_o
• So

(n − 1)s²/σ² ∼ χ²_{n−1}

• This gives us our test statistic:

X² = Σ_{i=1}^n (Yi − Ȳ)²/σ²_o

where σ²_o is the value specified in Ho: σ² = σ²_o.
• Sampling distribution of the test statistic: If Ho is true, which means that σ² = σ²_o, then

X² = (n − 1)s²/σ²_o = Σ_{i=1}^n (Yi − Ȳ)²/σ²_o ∼ χ²_{n−1}
Decision and Conclusion, Ho: σ² = σ²_o
• Decision: Compare the obtained test statistic to the chi-squared distribution with ν = n − 1 degrees of freedom, or find the p-value of the test statistic and compare it to α.
• Interpretation/Conclusion: What does the decision mean in terms of what you’re investigating?


Example of Ho: σ² = σ²_o
• High School and Beyond: Is the variance of math scores of students from private schools equal to 100?
• Statistical Hypotheses:

Ho: σ² = 100 versus Ha: σ² ≠ 100

• Assumptions: Math scores are normally distributed in the population of high school seniors who attend private schools, and the observations are independent.
Example of Ho: σ² = σ²_o (continued)
• Test Statistic: n = 94, s² = 67.16, and set α = .10.

X² = (n − 1)s²/σ²_o = (94 − 1)(67.16)/100 = 62.46

with ν = (94 − 1) = 93.
• Sampling Distribution of the Test Statistic: chi-square with ν = 93.

Critical values: .05χ²_93 = 71.76 and .95χ²_93 = 116.51.


Example of Ho: σ² = σ²_o (continued)
• Critical values: .05χ²_93 = 71.76 and .95χ²_93 = 116.51.
• Decision: Since the obtained test statistic X² = 62.46 is less than .05χ²_93 = 71.76, reject Ho at α = .10.
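The whole example can be reproduced numerically; a sketch with scipy, using the summaries n = 94 and s² = 67.16 from the slides:

```python
from scipy.stats import chi2

n, s2, sigma0_sq, alpha = 94, 67.16, 100, 0.10

X2 = (n - 1) * s2 / sigma0_sq           # test statistic
lo = chi2.ppf(alpha / 2, df=n - 1)      # .05 percentile of chi-square(93)
hi = chi2.ppf(1 - alpha / 2, df=n - 1)  # .95 percentile

print(round(X2, 2))                # 62.46
print(round(lo, 2), round(hi, 2))  # 71.76 116.51
print(X2 < lo)                     # True, so reject Ho at alpha = .10
```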
Confidence Interval Estimate of σ²
• Start with

Prob[ (α/2)χ²_ν ≤ (n − 1)s²/σ² ≤ (1−α/2)χ²_ν ] = 1 − α

• After a little algebra . . .

Prob[ 1/((1−α/2)χ²_ν) ≤ σ²/((n − 1)s²) ≤ 1/((α/2)χ²_ν) ] = 1 − α

• and a little more

Prob[ (n − 1)s²/((1−α/2)χ²_ν) ≤ σ² ≤ (n − 1)s²/((α/2)χ²_ν) ] = 1 − α

Here (p)χ²_ν denotes the (100p)th percentile of χ²_ν.


90% Confidence Interval Estimate of σ²
• The (1 − α)·100% confidence interval is

(n − 1)s²/((1−α/2)χ²_ν) ≤ σ² ≤ (n − 1)s²/((α/2)χ²_ν)

• So, with ν = 93,

(94 − 1)(67.16)/116.51 ≤ σ² ≤ (94 − 1)(67.16)/71.76 −→ (53.61, 87.04),

which does not include 100 (the null hypothesized value).
• s² = 67.16 isn’t in the center of the interval.
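The interval endpoints can be computed directly (a sketch; chi2.ppf returns the percentiles that appear in the denominators):

```python
from scipy.stats import chi2

n, s2, alpha = 94, 67.16, 0.10
nu = n - 1

lower = nu * s2 / chi2.ppf(1 - alpha / 2, df=nu)
upper = nu * s2 / chi2.ppf(alpha / 2, df=nu)

print(round(lower, 2), round(upper, 2))  # 53.61 87.04
print(lower <= 100 <= upper)             # False: 100 is outside the interval
```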
The F Distribution
• Comparing two variances: Are they equal?
• Start with two independent populations, each normal with equal variances . . .

Y1 ∼ N(µ1, σ²) i.i.d.
Y2 ∼ N(µ2, σ²) i.i.d.

• Draw an independent random sample from each population,

n1 from population 1
n2 from population 2
The F Distribution (continued)
• Using the data from each of the two samples, estimate σ²: s²_1 and s²_2.
• Both s²_1 and s²_2 are random variables, and their ratio is a random variable:

F = (estimate of σ²)/(estimate of σ²) = s²_1/s²_2
  = [χ²_{n1−1}/(n1 − 1)] / [χ²_{n2−1}/(n2 − 1)]
  = (χ²_ν1/ν1)/(χ²_ν2/ν2)

• The random variable F has an F distribution.
Testing for Equal Variances
• F gives us a way to test Ho: σ²_1 = σ²_2 (= σ²).
• Test statistic:

F = s²_1/s²_2
  = [ (1/(n1 − 1)) Σ_{i=1}^{n1} (Yi1 − Ȳ1)² (1/σ²) ] / [ (1/(n2 − 1)) Σ_{i=1}^{n2} (Yi2 − Ȳ2)² (1/σ²) ]
  = (χ²_ν1/ν1)/(χ²_ν2/ν2)

• A random variable formed from the ratio of two independent chi-squared variables, each divided by its degrees of freedom, is an “F–ratio” and has an F distribution.
Conditions for an F Distribution
• IF
• Both parent populations are normal.
• Both parent populations have the same
variance.
• The samples (and populations) are
independent.
• THEN the theoretical distribution of F is F_{ν1,ν2}, where
• ν1 = n1 − 1 = numerator degrees of freedom
• ν2 = n2 − 1 = denominator degrees of freedom


Eg of F Distributions: F_{2,ν2}

[Figure: F density curves with ν1 = 2 and varying ν2]

Eg of F Distributions: F_{5,ν2}

[Figure: F density curves with ν1 = 5 and varying ν2]

Eg of F Distributions: F_{50,ν2}

[Figure: F density curves with ν1 = 50 and varying ν2]


Important Properties of F Distributions
• The range of F–values is non-negative real
numbers (i.e., 0 to +∞).
• They depend on 2 parameters: numerator
degrees of freedom (ν1 ) and denominator
degrees of freedom (ν2 ).
• The expected value (i.e., the mean) of a random variable with an F distribution with ν2 > 2 is

E(F_{ν1,ν2}) = µ_{F_{ν1,ν2}} = ν2/(ν2 − 2).


Properties of F Distributions
• For any fixed ν1 and ν2 , the F distribution is
non-symmetric.
• The particular shape of the F distribution
varies considerably with changes in ν1 and ν2 .
• In most applications of the F distribution (at
least in this class), ν1 < ν2 , which means that
F is positively skewed.
• When ν2 > 2, the F distribution is uni-modal.



Percentiles of the F Dist.
• http://calculators.stat.ucla.edu/cdf
• p-value program
• SAS probf
• Tables in textbooks give the upper 25th, 10th, 5th, 2.5th, and 1st percentage points. Usually, the
• columns correspond to ν1, the numerator df, and the
• rows correspond to ν2, the denominator df.
• Getting lower percentiles using tables requires taking reciprocals.


Selected F values from Table V
Note: all values are for upper α = .05

ν1    ν2     F_{ν1,ν2}   which is also . . .
1     1      161.00      t²_1
1     20     4.35        t²_20
1     1000   3.85        t²_1000
1     ∞      3.84        t²_∞ = z² = χ²_1

ν1    ν2     F_{ν1,ν2}
1     20     4.35
4     20     2.87
10    20     2.35
20    20     2.12
1000  20     1.57
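The identities in the right-hand column can be confirmed with scipy (a sketch; ppf returns percentiles):

```python
from scipy.stats import f, t, norm, chi2

# The upper .05 F value with nu1 = 1 is the square of the two-tailed t value
F_1_20 = f.ppf(0.95, 1, 20)
t_20 = t.ppf(0.975, 20)
print(round(F_1_20, 2), round(t_20 ** 2, 2))  # 4.35 4.35

# With nu2 = infinity, F(1, inf) = z^2 = chi-square(1)
z2 = norm.ppf(0.975) ** 2
c1 = chi2.ppf(0.95, 1)
print(round(z2, 2), round(c1, 2))  # 3.84 3.84
```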


Test Equality of Two Variances
Are students from private high schools more
homogeneous with respect to their math test
scores than students from public high schools?
• Statistical Hypotheses:

Ho: σ²_private = σ²_public (equivalently, σ²_public/σ²_private = 1)
versus Ha: σ²_private < σ²_public (a 1-tailed test).

• Assumptions: Math scores of students from private schools and public schools are normally distributed and are independent both between and within school type.
Test Equality of Two Variances
• Test Statistic:

F = s²_1/s²_2 = 91.74/67.16 = 1.366

with ν1 = (n1 − 1) = (506 − 1) = 505 and ν2 = (n2 − 1) = (94 − 1) = 93.
Since the sample variance for public schools, s²_1 = 91.74, is larger than the sample variance for private schools, s²_2 = 67.16, put s²_1 in the numerator.
• Sampling Distribution of the Test Statistic: the F distribution with ν1 = 505 and ν2 = 93.
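The F-ratio and its one-tailed p-value (quoted on the next slide as .032) can be recomputed; a sketch where f.sf gives the upper-tail area:

```python
from scipy.stats import f

s2_public, s2_private = 91.74, 67.16
nu1, nu2 = 505, 93

F = s2_public / s2_private
p = f.sf(F, nu1, nu2)  # one-tailed (upper) p-value

print(round(F, 3))  # 1.366
print(round(p, 3))  # about .03
```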


Test Equality of Two Variances
• Decision: Our observed test statistic, F_{505,93} = 1.366, has a p–value = .032. Since p–value < α = .05, reject Ho.
• Or, we could compare the observed test statistic, F_{505,93} = 1.366, with the critical value F_{505,93}(α = .05) = 1.320. Since the observed value of the test statistic is larger than the critical value, reject Ho.
• Conclusion: The data support the conclusion that students from private schools are more homogeneous with respect to math test scores than students from public schools.
Example Continued
• Alternative question: “Are the individual
differences of students in public high schools
and private high schools the same with
respect to their math test scores?”
• Statistical Hypotheses: The null is the same, but the alternative hypothesis would be

Ha: σ²_public ≠ σ²_private (a 2-tailed alternative)

• Given α = .05, retain Ho, because our obtained p–value (the probability of getting a test statistic as large or larger than what we got) is larger than α/2 = .025.
Example Continued
• Given α = .05, retain Ho, because our obtained p–value (the probability of getting a test statistic as large or larger than what we got) is larger than α/2 = .025.
• Or, the rejection region (critical value) would be any F–statistic greater than F_{505,93}(α = .025) = 1.393.
• Point: This is a case where the choice between a 1- and 2-tailed test leads to different decisions regarding the null hypothesis.


Test for Homogeneity of Variances

Ho: σ²_1 = σ²_2 = . . . = σ²_J

• These include
• Hartley’s Fmax test
• Bartlett’s test
• a test regarding the variances of paired comparisons.
• You should know that they exist; we won’t go over them in this class. Such tests are not as important as they were once thought to be.
Test for Homogeneity of Variances
• Old View: Testing the equality of variances
should be a preliminary to doing independent
t-tests (or ANOVA).
• Newer View:
• Homogeneity of variance is required for small samples, which is exactly when tests of homogeneous variances do not work well. With large samples, we don’t have to assume σ²_1 = σ²_2.
• The test critically depends on population normality.
• If n1 = n2, t-tests are robust.


Test for Homogeneity of Variances
• For small or moderate samples where there is concern about possible heterogeneity −→ perform a quasi-t test.
• In experimental settings where you have control over the number of subjects and their assignment to groups/conditions/etc. −→ use equal sample sizes.
• In non-experimental settings where you have similar numbers of participants per group, the t test is pretty robust.
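The quasi-t test referred to above is available directly in scipy as Welch’s t test via equal_var=False; a sketch on made-up data (the two groups below are hypothetical illustrations, not the HSB samples):

```python
from scipy.stats import ttest_ind

# Hypothetical small samples with visibly different spreads
group1 = [52, 55, 49, 61, 58, 50, 54]
group2 = [40, 70, 35, 66, 45, 72, 38]

# equal_var=False requests the Welch (quasi-t) test, which does not
# assume homogeneous variances
t_stat, p_val = ttest_ind(group1, group2, equal_var=False)
print(round(t_stat, 2), round(p_val, 2))
```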


Relationship between z, t_ν, χ²_ν, and F_{ν1,ν2}
. . . and the central importance of the normal distribution.
• The Normal, Student’s t_ν, χ²_ν, and F_{ν1,ν2} are all theoretical distributions.
• We don’t ever actually take vast (infinite) numbers of samples from populations.
• The distributions are derived based on mathematical logic, statements of the form

IF . . . THEN . . .


Derivation of Distributions
• Example
• IF we draw independent random samples of size
(large) n from a population and compute the mean
Ȳ and repeat this process many, many, many, many
times,
• THEN Ȳ is approximately normal.

• Assumptions are part of the “if” part, the conditions used to deduce the sampling distributions of statistics.
• The t, χ², and F distributions all depend on a normal “parent” population.
Chi-Square Distribution
• χ²_ν = the sum of ν independent squared normal random variables with mean µ = 0 and variance σ² = 1 (i.e., “standard normal” random variables):

χ²_ν = Σ_{i=1}^ν z²_i where zi ∼ N(0, 1)

• Based on the Central Limit Theorem, the “limit” of the χ²_ν distribution (i.e., as ν → ∞) is normal.
The F Distribution
• F_{ν1,ν2} = the ratio of two independent chi-squared random variables, each divided by its respective degrees of freedom:

F_{ν1,ν2} = (χ²_ν1/ν1)/(χ²_ν2/ν2)

• Since the χ²_ν’s depend on the normal distribution, the F distribution also depends on the normal distribution.
• The “limiting” distribution of F_{ν1,ν2} as ν2 → ∞ is χ²_ν1/ν1, because as ν2 → ∞, χ²_ν2/ν2 → 1.
Student’s t Distribution
Note that

t²_ν = [ (Ȳ − µ)/(s/√n) ]²
    = (Ȳ − µ)² n / [ Σ_{i=1}^n (Yi − Ȳ)²/(n − 1) ]
    = [ (Ȳ − µ)² n (1/σ²) ] / [ (Σ_{i=1}^n (Yi − Ȳ)²/(n − 1)) (1/σ²) ]
    = [ (Ȳ − µ)²/(σ²/n) ] / [ Σ_{i=1}^n (Yi − Ȳ)²/(σ²(n − 1)) ]
    = z²/(χ²_ν/ν)


Student’s t Distribution (continued)
• Student’s t is based on the normal:

t²_ν = z²/(χ²_ν/ν) or t_ν = z/√(χ²_ν/ν)

• A squared t random variable equals the ratio of a squared standard normal to a chi-squared divided by its degrees of freedom. So . . .


Student’s t Distribution (continued)
Since

t²_ν = z²/(χ²_ν/ν) or t_ν = z/√(χ²_ν/ν)

• As ν → ∞, t_ν → N(0, 1), because χ²_ν/ν → 1.
• Since z² = χ²_1,

t²_ν = (z²/1)/(χ²_ν/ν) = (χ²_1/1)/(χ²_ν/ν) = F_{1,ν}

• Why are the assumptions of normality, homogeneity of variance, and independence required for
• t tests for mean(s)?
• testing homogeneity of variance(s)?
Summary of Relationships
Let z ∼ N(0, 1)

Distribution   Definition                               Parent        Limiting
χ²_ν           Σ_{i=1}^ν z²_i (independent z’s)         normal        As ν → ∞, χ²_ν → normal
F_{ν1,ν2}      (χ²_ν1/ν1)/(χ²_ν2/ν2) (indep. χ²’s)      chi-squared   As ν2 → ∞, F_{ν1,ν2} → χ²_ν1/ν1
t_ν            z/√(χ²_ν/ν)                              normal        As ν → ∞, t_ν → normal

Note: F_{1,ν} = t²_ν; also F_{1,∞} = t²_∞ = z² = χ²_1.
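Each limiting relationship in the table can be checked numerically by comparing percentiles at large ν (a sketch; the specific large ν values are arbitrary):

```python
from scipy.stats import chi2, f, t, norm

# chi-square -> normal: the standardized 90th percentile approaches z_.90
nu = 200_000
std_chi = (chi2.ppf(0.90, nu) - nu) / (2 * nu) ** 0.5
z90 = norm.ppf(0.90)
print(round(std_chi, 2), round(z90, 2))  # 1.28 1.28

# F(nu1, infinity) -> chi-square(nu1)/nu1
f_val = f.ppf(0.95, 5, 1_000_000)
chi_over = chi2.ppf(0.95, 5) / 5
print(round(f_val, 3), round(chi_over, 3))  # 2.214 2.214

# t -> normal
t_val = t.ppf(0.95, 1_000_000)
z95 = norm.ppf(0.95)
print(round(t_val, 3), round(z95, 3))  # 1.645 1.645
```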