Watch lecture 1-1 and then complete this task before attempting the multiple-choice questions
below.
The data below are from a study investigating genetic variation in individuals affected by
pancreatic cancer.
Researchers genotyped 5 variants in 1000 cases and 1000 controls. Observed allele counts for
each variant are given in the table below.
Calculate the allele frequencies for each variant in cases and controls.
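The calculation can be sketched in Python as follows. This is a minimal illustration that assumes the table gives allele counts, that the variants are autosomal (two alleles per person), and that every individual was genotyped successfully, so each group of 1000 people contributes 2000 alleles. The function name and the counts in the example are hypothetical and are not taken from the study table.

```python
def allele_frequency(allele_count: int, n_individuals: int) -> float:
    """Frequency of one allele among the 2 * n_individuals alleles of a diploid group."""
    total_alleles = 2 * n_individuals  # assumption: autosomal variant, no missing genotypes
    if not 0 <= allele_count <= total_alleles:
        raise ValueError("allele_count must lie between 0 and 2 * n_individuals")
    return allele_count / total_alleles


# Hypothetical minor allele counts, for illustration only.
minor_count_cases = 620
minor_count_controls = 540

freq_cases = allele_frequency(minor_count_cases, 1000)        # 620 / 2000
freq_controls = allele_frequency(minor_count_controls, 1000)  # 540 / 2000
print(f"cases: {freq_cases:.3f}, controls: {freq_controls:.3f}")
```

The frequency of the other allele at the same variant is one minus this value, since the two frequencies must sum to 1.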
Some SNPs have slightly different allele frequencies in cases and controls, but does this just
reflect normal variation? It can be hard to tell from the allele frequencies alone. To test
whether a difference is statistically significant, we can use Fisher’s exact test.
Go to https://www.graphpad.com/quickcalcs/contingency1/
Use the contingency table to test the significance for each SNP. “Group 1” will be the cases and
“Group 2” will be the controls. The calculator accounts for the total sample size, so enter the allele
counts rather than the frequencies. “Outcome 1” will be the minor allele count (that is, the less
frequent allele in your sample) and “Outcome 2” will be the major allele count (that is, the more
common allele in your sample). Make sure that Fisher’s exact test is selected and that you
perform a 2-tailed test (since the allele frequency in cases may be higher or lower than that of
controls). Record the p-value for each SNP.
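For readers who want to check the calculator's output, the same 2x2 test can be sketched in plain Python. This is an illustrative implementation, not the calculator's own code: it defines the two-tailed p-value as the sum of the probabilities of all tables with the same margins that are no more likely than the observed one, which is one common convention, so results may differ slightly from a tool that defines the two-tailed value differently. The function name and the allele counts in the example are hypothetical. `scipy.stats.fisher_exact` offers a library equivalent if SciPy is installed.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (cases, controls); columns are outcomes
    (minor allele count, major allele count).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x minor alleles in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    total = 0.0
    for x in range(lowest, highest + 1):
        p_x = prob(x)
        if p_x <= observed * (1 + 1e-7):
            total += p_x
    return min(1.0, total)


# Hypothetical allele counts (2000 alleles per group), for illustration only.
p_value = fisher_exact_two_sided(620, 1380, 540, 1460)
print(f"two-sided p-value: {p_value:.4g}")
```

The arguments follow the order used in the calculator: Group 1 Outcome 1, Group 1 Outcome 2, Group 2 Outcome 1, Group 2 Outcome 2.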
Which SNPs are significantly associated with pancreatic cancer at a p-value threshold of 0.05?
What is the “risk allele” for each variant (i.e. the allele that is more common in cases than controls)?
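The definition of the risk allele can be written as a short comparison. This is a sketch only: the function name, the allele labels, and the frequencies in the example are hypothetical and do not come from the study table.

```python
from typing import Optional


def risk_allele(minor: str, major: str,
                minor_freq_cases: float, minor_freq_controls: float) -> Optional[str]:
    """Return the allele that is more common in cases than in controls.

    Returns None when the two groups have identical frequencies.
    """
    if minor_freq_cases > minor_freq_controls:
        return minor
    if minor_freq_cases < minor_freq_controls:
        # If the minor allele is rarer in cases, the major allele is enriched in cases.
        return major
    return None


# Hypothetical example: minor allele T at 0.31 in cases and 0.27 in controls.
print(risk_allele("T", "C", 0.31, 0.27))
```

Because the two allele frequencies at a variant sum to 1, it is enough to compare the minor allele frequency between the groups.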
In this experiment, we tested 5 variants. To correct for multiple testing, we can apply a
Bonferroni correction, adjusting our nominal significance threshold of 0.05 to 0.01 (i.e. 0.05 ÷ 5).
rs9543325, rs2736098
What would the corrected p-value threshold be if we had tested these SNPs as part of a GWAS
including 1 million variants?
0.05 / 1,000,000 = 5 × 10^-8
None
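The Bonferroni arithmetic used in this task can be checked with a few lines of Python. This is a minimal sketch; the function name is an assumption introduced for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


five_variant_threshold = bonferroni_threshold(0.05, 5)    # 0.05 / 5
gwas_threshold = bonferroni_threshold(0.05, 1_000_000)    # 0.05 / 1,000,000
print(five_variant_threshold, gwas_threshold)
```

A SNP remains significant after correction only if its p-value is below the corrected threshold for the number of variants tested.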