
Statistics for AP Biology           Name_____________________________________________


 
NULL HYPOTHESIS TESTING WITH CHI-SQUARE ANALYSIS
 
 
Overview  
• The chi-square statistic (χ²) measures the magnitude of the difference between observed and expected
values.
o Observed values (o) are the data collected by the researcher; they are your measurements of the
dependent variable.
o Expected values (e) are predicted by the null hypothesis.
• In other words, chi-square analysis measures how well your data fit the null hypothesis.
o You determine whether or not the observed values are significantly different from the
expected values.
o Based on that, you either “reject” or “fail to reject” the null hypothesis.
 
When is it appropriate to use chi-square (χ²) analysis?
• The data must be categorical, not numerical. Observed values (o) are determined by counting the
number of individuals in each category. The categories correspond to the independent variable.
 
Categorical data                           Numerical data
• Survival: alive or dead                  • Length of rainbow trout in centimeters
• Coin toss: heads or tails                • Core body temperature in °C
• Roll of a die: 1, 2, 3, 4, 5, or 6       • Scores on the biology exam in percent
• Phenotype: dominant or recessive         • Consumption of soda in liters per day
• Genotype: MM, Mm, or mm                  • Size of habitat in square meters
 
• Your expected values (e) must be large. If expected values are near or below 5, do not use the χ² test.
• Use only whole-number raw data to calculate the χ²-value. Do NOT use ratios, percentages, or mean
values. This may require that you determine the expected whole-number raw data (e) based on
predicted ratios before you calculate the χ²-value.
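
Converting a predicted ratio into expected counts can be sketched in a few lines of Python. This is only an illustration: the function names, the 3:1 ratio, and the total of 160 individuals are hypothetical and were chosen so the arithmetic is easy to follow.

```python
def expected_counts(total, ratio):
    """Scale a predicted ratio (such as 3:1) to expected counts for `total` individuals."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def large_enough(expected, minimum=5):
    """Rule of thumb from the text: every expected value should be above 5."""
    return all(e > minimum for e in expected)


# Hypothetical example: 160 individuals and a predicted 3:1 ratio.
expected = expected_counts(160, [3, 1])
print(expected)                # [120.0, 40.0]
print(large_enough(expected))  # True
```

The expected counts, not the ratio itself, are what go into the calculation.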
 
How to do chi-square analysis:
1. Form  a  null  hypothesis.    The  null  hypothesis  must  reliably  predict  numerical  data  based  on  
expected  probabilities.    Thus  the  null  hypothesis  is  the  source  of  your  expected  values.  
• Typically,  the  null  hypothesis  states  that  the  independent  variable  has  no  effect  on  the  
dependent  variable.    From  this,  you  can  predict  expected  values  for  your  experiment  based  
on  probabilities.  
• This  is  in  contrast  to  the  “research  hypothesis”  for  an  experiment,  which  predicts  
that  the  independent  variable  will  have  a  certain  effect  on  the  dependent  variable.    
For  example,  your  research  hypothesis  may  predict  that  as  you  increase  the  
independent  variable,  the  dependent  variable  will  also  increase.    However,  you  
could  not  reliably  predict  by  what  magnitude  the  dependent  variable  would  
increase  based  on  probabilities.    
• You  may  be  testing  whether  or  not  your  experimental  data  fit  a  prediction  made  by  a  
theoretical  model.    In  this  case,  your  null  hypothesis  is  the  same  as  the  theoretical  model  
and  the  theoretical  model  probabilities  are  used  to  predict  the  expected  values.  
• Example  1:    You  may  predict  that  the  results  of  a  monohybrid  genetic  cross  (Mm  x  
Mm)  will  fit  a  3:1  ratio  for  dominant  versus  recessive  phenotypes  based  on  
Mendelian  genetics.    The  null  hypothesis  would  be  based  on  the  Mendelian  ratios.  
• Example  2:    You  may  be  testing  whether  your  observations  fit  certain  claims  made  
by  a  manufacturer.    The  manufacturer’s  claims  act  as  the  null  hypothesis  and  
theoretical  model  on  which  to  base  the  expected  values.  
 
2. Do  the  experiment  and  collect  the  data.    These  are  your  observed  values.  
 
3. Make  a  table  to  organize  observed  and  expected  values  for  the  different  categories  and  subsequent  
calculations.      
• You  must  have  an  observed  (o)  and  expected  (e)  value  for  each  category.      
• The  sum  of  your  observed  values  must  equal  the  sum  of  your  expected  values.  
• Only  raw  whole  number  data  values  are  used  (no  ratios,  no  percentages,  no  means).  
 
Example  experiment:    Fill  in  the  information  below  and  the  table  based  on  the  directions  given  as  you  
read  through  the  steps  outlined.  
 
You  toss  a  coin  150  times  and  observe  that  you  get  heads  62  times  and  tails  88  times.      
 
Null  hypothesis:  
 
 
Independent  variable:  
 
 
Dependent  variable:  
 
 
Chi-square Calculation for Coin Toss Experiment

                Category 1:    Category 2:
                heads          tails          Total
Observed (o)    62             88             150
Expected (e)
(o-e)
(o-e)²
(o-e)²/e
 
 
4. Calculate the chi-square value (χ²) for the data using the formula χ² = ∑ (o-e)²/e.
Record each step in the table.
• First calculate the expected values (e) for each category.
• Then find (o-e) for each category.
• Square (o-e) for each category.
• In each category, divide the (o-e)² value by e.
• Sum (∑), or add together, all of the (o-e)²/e values. This is the χ²-value!
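
The same steps can be sketched in Python. The example below deliberately uses different, hypothetical data (100 offspring from a monohybrid cross, 84 dominant and 16 recessive, tested against a 3:1 ratio) so that the coin toss is still yours to work out. The function name and the row labels are illustrative, not part of any standard library.

```python
def chi_square(observed, expected):
    """Return (rows, x2): the per-category work and the chi-square value."""
    if len(observed) != len(expected):
        raise ValueError("each category needs an observed and an expected value")
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected totals must be equal")
    rows = []
    for o, e in zip(observed, expected):
        diff = o - e                      # (o-e)
        squared = diff ** 2               # (o-e) squared
        rows.append({"o": o, "e": e, "o-e": diff,
                     "(o-e)^2": squared, "(o-e)^2/e": squared / e})
    x2 = sum(row["(o-e)^2/e"] for row in rows)  # sum over all categories
    return rows, x2


# Hypothetical data: 84 dominant and 16 recessive; a 3:1 ratio predicts 75 and 25.
rows, x2 = chi_square([84, 16], [75, 25])
for row in rows:
    print(row)
print(round(x2, 2))  # 4.32
```

Each dictionary in `rows` corresponds to one column of the calculation table, which makes it easy to check your hand calculation step by step.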
 
Perform  the  calculations  for  the  coin  toss  experiment  and  record  your  work  in  the  table  above.  
 
5. Determine the degrees of freedom for your experiment. Degrees of freedom = n-1 where “n” is
equal to the number of categories.
 
Degrees  of  freedom  for  the  coin  toss  experiment:  
 
 
6. Use the chi-square probability table to find the probability value, also known as the p-value, based
on the calculated χ²-value and the degrees of freedom.
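
If you want to check a table lookup, the p-value can also be computed directly. The sketch below uses only Python's standard `math` module and a standard recurrence for the upper tail of the chi-square distribution. The function name and the chi-square value of 4.32 are hypothetical, and on the exam you will still be expected to use the table.

```python
import math


def chi_square_p_value(x2, df):
    """Probability of a chi-square value at least as large as x2 with df degrees of freedom."""
    if df < 1 or x2 < 0:
        raise ValueError("df must be at least 1 and x2 must not be negative")
    y = x2 / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # closed form for df = 1
    # Step up two degrees of freedom at a time until a equals df/2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


categories = ["dominant", "recessive"]  # hypothetical categories
df = len(categories) - 1                # degrees of freedom = n-1
p = chi_square_p_value(4.32, df)        # 4.32 is a hypothetical chi-square value
print(df, p < 0.05)                     # 1 True
```

A larger chi-square value at the same degrees of freedom always gives a smaller p-value, which is the same pattern you see when reading across a row of the table.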
 
What is the p-value for the coin toss experiment (give the range):
 

Table of probability-values for chi-square analysis. Based on the degrees of freedom (column 1) and
calculated χ²-value, find the corresponding p-value (Probability, row 1).
 

 
 
Study the chi-square probability table above. State the relationship among the magnitude of difference
between the observed and expected values, the χ²-value, and the p-value:
 
 
 
 
7. Always report the p-value with respect to the critical value.
• The critical value represents our “cut off” for whether or not we consider the difference between
the observed and expected data to be significant.
• By convention, the critical value of “p” in science is p = 0.05.
• Report either p < 0.05 or p > 0.05.
 
How would you report the p-value for the coin toss experiment?


What does the p-value mean?
 
• It  is  unlikely  that  the  observed  data  will  exactly  fit  with  the  expected  data,  even  if  the  null  
hypothesis  is  correct  and/or  the  observed  data  do  fit  the  theoretical  model.    There  is  inherent  
variability  due  to  random  chance  and  this  leads  to  slight  deviations  in  the  data.    The  question  is,  
how  much  can  the  observed  results  deviate  from  the  expected  results  and  still  be  considered  to  be  
consistent  with  the  expected  results?  
 
• “p” is the probability of getting a difference between the observed and expected results at least
as large as the one you found, if the null hypothesis is true and only random chance is at work.

o If p<0.05, then a deviation this large between the observed and expected results would occur
less than 5% of the time by random chance alone.
§ This means the observed data are very different from the data predicted by your
null hypothesis.

§ Your  final  conclusion:    The  difference  between  the  observed  and  expected  
data  is  significant.    Reject  the  null  hypothesis.    Consider  an  alternative  
hypothesis.  
 
o If p>0.05, then a deviation this large between the observed and expected results would occur
more than 5% of the time by random chance alone.
§ This means the observed data are not significantly different from the data predicted by
your null hypothesis.
§ Your final conclusion: The difference between the observed and expected
data is not significant. You cannot reject (that is, you fail to reject) the null
hypothesis.
 
8. Based on the χ²-value, degrees of freedom, and p-value, state a conclusion for your experiment.

o Precise wording in your final conclusion is very important.
§ NEVER say: “the data are significant” or “the data are insignificant”. You conducted a
well-designed experiment. Your data are meaningful and significant. Chi-square
analysis does not test this.
§ DO say: “the DIFFERENCE between the observed and expected data IS SIGNIFICANT”
or “the DIFFERENCE between the observed and expected data IS NOT SIGNIFICANT”.
Chi-square analysis does test this.
§ NEVER  say  that  the  null  hypothesis  is  “supported”  or  “accepted”  or  “proven”.      
§ DO  say  that  you  either  “REJECT”  or  “FAIL  TO  REJECT”  the  null  hypothesis.      
 
o In  your  final  conclusion  you  must  address:  
§ Are  the  observed  values  significantly  different  from  the  expected  values?  
§ Give evidence by stating the χ²-value, degrees of freedom, and p-value.
§ State whether you “reject” or “fail to reject” the null hypothesis.
§ If you “reject” the null hypothesis, make a statement about your alternative
(research) hypothesis. Should you consider the alternative hypothesis you stated? Or
should a different alternative hypothesis be considered?
§ Use  language  that  reminds  the  reader  what  it  was  you  were  studying  in  the  first  place  
(ex.  salt  concentration  and  effect  on  plant  growth  if  that  is  what  you  were  testing)  
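
One way to make sure a conclusion covers the required points is to assemble it from its parts. The sketch below is only an illustration of the wording rules above: the function name, the topic, and the numbers passed in are hypothetical, and a written conclusion should still discuss your alternative hypothesis in your own words.

```python
def state_conclusion(topic, x2, df, p_value, alpha=0.05):
    """Build a conclusion that names the topic, the evidence, and the decision."""
    if p_value < alpha:
        difference = "is significant"
        comparison = "<"
        decision = "reject"
    else:
        difference = "is not significant"
        comparison = ">"
        decision = "fail to reject"
    return (
        f"For {topic}, the difference between the observed and expected data "
        f"{difference} (chi-square = {x2:.2f}, df = {df}, p {comparison} {alpha}). "
        f"We {decision} the null hypothesis."
    )


# Hypothetical values for a monohybrid cross.
print(state_conclusion("offspring phenotype in a monohybrid cross", 4.32, 1, 0.038))
```

Notice that the sentence talks about the difference being significant or not, never about the data, and that it only ever rejects or fails to reject the null hypothesis.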
 
State  a  conclusion  for  the  coin  toss  being  sure  to  address  the  points  above:  
 
 
 
 
 
 
 
 
 
 
*On the AP Exam, you will be provided with
the version of the chi-square probability
table pictured on the right. You will be
expected to know that the critical value for
“p” is 0.05 and that differences with p<0.05 are
significant while differences with p>0.05 are
not significant. Be very familiar with this
version of the table.
