
Chi-Squared (χ²) Analysis

AP Biology
What is Chi-Squared?
• In genetics, you can predict genotypes
based on probability (expected results)
• Chi-squared is a form of statistical analysis
used to compare the actual results
(observed) with the expected results
• NOTE: χ² is the name of the whole
variable – you will never take the square
root of it or solve for χ
• In statistics, a result is
significant if it is unlikely
to have occurred by
chance.
• Why test? Maybe the independent variable
(the test condition being examined) has no
effect at all.
• Maybe you got your results by chance. If
so, they aren’t statistically significant.
Chi-squared
• If the expected (probability) and observed
(actual) values are the same then χ² = 0
• If the χ² value is 0 or is small then the data
fits your hypothesis (the expected values)
well and any difference is merely chance.
• By calculating the χ² value you determine
if there is a statistically significant
difference between the expected and actual
values.
Null hypothesis
• In statistics, a null hypothesis is a
hypothesis set up to be nullified or refuted
in order to support an alternative
hypothesis.
• When used, the null hypothesis (chance) is
presumed true until statistical evidence in
the form of a “hypothesis test” indicates
otherwise.
• So when you do a Chi-Square test you say “I
have a null hypothesis that I got this result in
my experiment because of chance.”
• Now you don’t really think that. You are
setting out to reject the null hypothesis so you
can accept your alternative hypothesis.
• Your alternative hypothesis would be
something like, “This experimental result is
NOT because of chance; it is because of the
variables I set up. I am a genius, my
experiment shows a significant difference and
is NOT by chance alone.”
Step 1: Calculating χ²
• First, determine what your expected and
observed values are.
• Observed (actual) values: these come
straight from your data – usually no
calculations
• Expected values: based on probability
(this is where the math comes in)
• Suggestion: make a table with the expected
and actual values. If there is one variable and
no predicted ratio, just divide the total evenly
for expected.
      R     r
R     RR    Rr
r     Rr    rr
Step 1: Example
• Observed (actual) values: Suppose you
have 90 tongue rollers and 10 nonrollers
• Expected: Suppose the parent genotypes
were both Rr → using a Punnett square, you
would expect 75% tongue rollers, 25%
nonrollers (one variable)
• This translates to 75 tongue rollers, 25
nonrollers (since the population you are
dealing with is 100 individuals)
• Table should look like this:
                 Expected   Observed (Actual)
Tongue rollers   75         90
Nonrollers       25         10
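The step from a predicted ratio to expected counts can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is made up for this example, and it assumes the 3:1 ratio from the Rr x Rr Punnett square and the sample of 100 individuals used above.

```python
def expected_counts(ratio, total):
    """Scale a predicted ratio (e.g. 3:1) to expected counts for `total` individuals."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

# Rr x Rr cross: 3 tongue rollers : 1 nonroller, in a sample of 100
expected = expected_counts([3, 1], 100)
print(expected)  # [75.0, 25.0]
```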
Step 2: Calculating χ²
• Use the formula to calculate χ²:
χ² = Σ (observed – expected)² / expected
• For each different category (genotype or
phenotype) calculate
(observed – expected)² / expected
• Add up all of these values to determine χ²
Step 2: Example
• Using the data from before:
                 Expected   Observed (Actual)
Tongue rollers   75         90
Non-rollers      25         10
• Tongue rollers
(90 – 75)² / 75 = 3
• Non-rollers
(10 – 25)² / 25 = 9
• χ² = 3 + 9 = 12
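The same sum can be checked with a short Python sketch. The function name `chi_square` is just a label chosen for this example; it applies (observed – expected)² / expected to each category and adds the results.

```python
def chi_square(observed, expected):
    """Sum (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [90, 10]   # tongue rollers, non-rollers (actual)
expected = [75, 25]   # from the 3:1 Punnett square ratio
chi2 = chi_square(observed, expected)
print(chi2)  # 12.0
```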
Step 3: Determining Degrees of Freedom
• Degrees of freedom = # of categories – 1
• Ex. For the example problem, there were
two categories (tongue rollers and
nonrollers) → degrees of freedom = 2 – 1
• Degrees of freedom = 1
• Each new variable brings in new degrees
of freedom…
• The significance of a result is measured by
its p-value
• The smaller the p-value, the more significant
the result is said to be.
• If the p-value is low enough, we reject the
null hypothesis and accept the alternative
hypothesis.
• “It wasn’t chance, it was the thing in my
experiment.”
• Popular levels of significance are 10%, 5%,
and 1%, all represented by the Greek symbol
α (alpha). Remember you could write those as
0.10, 0.05, and 0.01
• We use 5% a lot
• If the p-value is lower than 0.05, a difference
this large would show up less than 5% of the
time if only chance were at work. It was
probably this other thing.
DETERMINE WHERE YOUR CHI SQUARE
VALUE IS IN THE TABLE BELOW:
[Table: chi-square critical values by degrees of freedom and p-value]
• To see what p-value matches your chi-square value
• Compare your chi-square value with those in the row that
corresponds to one degree of freedom.
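For one degree of freedom only, the p-value can also be computed directly instead of read from a table. The sketch below relies on a standard identity (a chi-square value with 1 degree of freedom is the square of a standard normal value), so the p-value is `erfc(sqrt(χ² / 2))`. The function name `p_value_df1` is invented for this example, and the shortcut does not apply to other degrees of freedom.

```python
import math

def p_value_df1(chi2):
    """P-value for a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

print(p_value_df1(3.841))  # very close to 0.05, the usual cutoff
print(p_value_df1(12))     # far below 0.05 for the tongue-roller example
```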
Step 4: Critical Value
• Using the degrees of freedom, determine
the critical value using the provided table
• Df = 1 → Critical value = 3.84 (at p = 0.05)
BIOLOGISTS GENERALLY REJECT THE NULL
HYPOTHESIS IF THE VALUE OF P IS LESS THAN 0.05.
• Since you got 12 in this example you would
reject the null as it is over the critical value.
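The table lookup and comparison can be sketched as follows. The dictionary holds the p = 0.05 column of a standard chi-square table for the first three degrees of freedom; the names `CRITICAL_0_05` and `reject_null` are made up for this illustration.

```python
# Critical values at p = 0.05 from a standard chi-square table,
# keyed by degrees of freedom
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815}

def reject_null(chi2, df):
    """True if the chi-square value is over the 0.05 critical value."""
    return chi2 > CRITICAL_0_05[df]

print(reject_null(12, 1))  # True: 12 is over 3.841
```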
What does this mean?
• Accepting the null hypothesis would mean
that the results of the experiment differ from
the expected only by chance.
• In this example we rejected the null, so the
experimenter can conclude that the tongue
rolling result is statistically significant: the gap
between 90 observed and 75 expected rollers
isn’t just chance… there must be another
reason for it!
• (hint: it is genetics….)
Step 5: Conclusion
• If χ² > critical value…
there is a statistically significant difference
between the actual and expected values.
• If χ² < critical value…
there is NOT a statistically significant
difference between the actual and expected
values, and any difference just happens by chance.
Step 5: Example
• χ² = 12 > 3.84
There is a statistically significant difference
between the observed and expected
population – not due to chance.
• People who roll their tongues ARE
different than people who don’t, and it isn’t
just the roll of the dice – something causes
it.
Example
Null Hypothesis: There is no statistical
difference between the number of males and
females enrolled in AP Biology.

One variable: male or female

CHI SQUARE VALUE:
• If the null hypothesis is supported by analysis
  • The assumption is that the number of males and females in this
  class is random.
• If the null hypothesis is not supported by analysis
  • The observed values are very far from the expected
  values… something non-random must be occurring…
  • Possible explanations: career interests of males vs. females, work
  ethic of males vs. females, …
Calculating the Chi Square Value for my two classes
           Observed (Actual)   Expected (Theoretical)
Males      21                  27.5
Females    34                  27.5
• One variable, so divide the total evenly for expected: (21 + 34) / 2 = 27.5

At p = 0.05 with degrees of freedom = 2 – 1 = 1, the critical value is 3.841.
If our value is more than 3.841, we reject the null: it is not by chance, and
something is statistically significant. If it is below 3.841, we accept the
null: it is by chance.
So do we accept the null or no?
• Girls = (34 – 27.5)² / 27.5 = 1.54
• Boys = (21 – 27.5)² / 27.5 = 1.54
• 1.54 + 1.54 = 3.08, and 3.08 < 3.841
• Our χ² value is less than the critical value, so we
accept the null hypothesis and there is NO
statistical difference between the number of males
and females enrolled in AP Biology.
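A minimal Python sketch of the one-variable case, where the expected value is just the total divided evenly among the categories. The function name `even_split_chi_square` is hypothetical. Because it does not round the two terms before adding them, its result comes out slightly below the hand-rounded total, but it is on the same side of the critical value.

```python
def even_split_chi_square(observed):
    """Chi-square when every category is expected to get an equal share."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

chi2 = even_split_chi_square([21, 34])  # males, females
print(round(chi2, 2))  # 3.07
print(chi2 < 3.841)    # True: below the critical value for 1 degree of freedom
```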
What if it is MORE than ONE
VARIABLE?!?
• I want to know if there is a significant
difference in who passes or fails between people
who use a new book vs. an old book.
(2 variables - new vs. old and pass vs. fail)
• Data: New book, 26 pass and 4 fail
• Old book, 22 pass and 9 fail
• Is it by chance or is the new book better?
• We can’t just divide evenly here, eh?
Other way to get the expected value
• Expected chart (called the contingency
table)
• Expected = (Row total x Column total) / Total people *
Observed   Pass   Fail   Total
New book   26     4      30
Old book   22     9      31
Total      48     13     61
*for each possibility in the variables

• New book passed = (30 x 48) / 61 = 23.61
• New book failed = (30 x 13) / 61 = 6.39
• Old book passed = (31 x 48) / 61 = 24.39
• Old book failed = (31 x 13) / 61 = 6.61
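The row total x column total / total people rule can be applied to every cell at once. The sketch below is illustrative only; `expected_table` is a name invented here, and the observed counts are the pass/fail data for the new and old book.

```python
def expected_table(observed):
    """Expected count for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

observed = [[26, 4],   # new book: pass, fail
            [22, 9]]   # old book: pass, fail
expected = expected_table(observed)
for row in expected:
    print([round(x, 2) for x in row])
# [23.61, 6.39]
# [24.39, 6.61]
```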
                  Observed   Expected
New book passed   26         23.61
New book failed   4          6.39
Old book passed   22         24.39
Old book failed   9          6.61

Chi squared calculations:
• nbp = (26 – 23.61)² / 23.61 = 0.242
• nbf = (4 – 6.39)² / 6.39 = 0.894
• obp = (22 – 24.39)² / 24.39 = 0.234
• obf = (9 – 6.61)² / 6.61 = 0.864
• 0.242 + 0.894 + 0.234 + 0.864 = 2.234
• 2 rows and 2 columns, so degrees of freedom = (2 – 1) x (2 – 1) = 1
(for a contingency table, use (rows – 1) x (columns – 1), not possibilities – 1)
• Still p of 0.05, so the critical value is still 3.841
• 2.234 < 3.841 so we accept the null and there is no statistical
difference between using the old book and using the new.
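The whole two-variable calculation can be sketched end to end in Python. This is a hedged illustration, not a replacement for a statistics package: the name `independence_chi_square` is made up, no continuity correction is applied (matching the hand calculation), and degrees of freedom are computed with the usual (rows – 1) x (columns – 1) rule for contingency tables, which gives 1 for a 2 x 2 table. Because nothing is rounded along the way, the result lands slightly above the hand-rounded total.

```python
def independence_chi_square(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

chi2, df = independence_chi_square([[26, 4], [22, 9]])
print(round(chi2, 2), df)  # 2.24 1
```

The value is well under 3.841, the 0.05 critical value for 1 degree of freedom, so it is also under any larger critical value and the null is not rejected.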
