
Task 5-2 Association Task

Watch lecture 1-1 and then complete this task before attempting the multiple-choice questions
below.
The data below are from a study investigating genetic variation in individuals affected by
pancreatic cancer.

Researchers genotyped 5 variants in 1000 cases and 1000 controls. Observed allele counts for
each variant are given in the table below. Each individual carries two alleles, so every row
sums to 2000 alleles.

Group     SNP        Position (chr:bp, hg38)  A allele count  C allele count  G allele count  T allele count
Cases     rs9854771  chr3:189790682           711             0               1289            0
Cases     rs2736098  chr5:1293971             0               1630            0               370
Cases     rs206936   chr6:34335092            1274            0               726             0
Cases     rs763780   chr6:52236941            0               157             0               1843
Cases     rs9543325  chr13:73342491           0               733             0               1267
Controls  rs9854771  chr3:189790682           685             0               1315            0
Controls  rs2736098  chr5:1293971             0               1532            0               468
Controls  rs206936   chr6:34335092            1311            0               689             0
Controls  rs763780   chr6:52236941            0               122             0               1878
Controls  rs9543325  chr13:73342491           0               827             0               1173

Calculate the allele frequencies for each variant in cases and controls by dividing each allele
count by the total number of alleles in that group.

Group     SNP        Position (chr:bp, hg38)  A allele freq  C allele freq  G allele freq  T allele freq
Cases     rs9854771  chr3:189790682           0.356          0.000          0.645          0.000
Cases     rs2736098  chr5:1293971             0.000          0.815          0.000          0.185
Cases     rs206936   chr6:34335092            0.637          0.000          0.363          0.000
Cases     rs763780   chr6:52236941            0.000          0.079          0.000          0.922
Cases     rs9543325  chr13:73342491           0.000          0.367          0.000          0.634
Controls  rs9854771  chr3:189790682           0.343          0.000          0.658          0.000
Controls  rs2736098  chr5:1293971             0.000          0.766          0.000          0.234
Controls  rs206936   chr6:34335092            0.656          0.000          0.345          0.000
Controls  rs763780   chr6:52236941            0.000          0.061          0.000          0.939
Controls  rs9543325  chr13:73342491           0.000          0.414          0.000          0.587

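The frequency calculation can be sketched in Python as follows. This is only an illustration, not part of the original task: the dictionary layout and the function name `allele_frequencies` are assumptions, and only the two alleles observed at each SNP are listed (the other two have a count of 0). Frequencies are printed to four decimal places, so values such as 0.3555 appear unrounded rather than as the 0.356 shown in the table.

```python
# Allele counts copied from the counts table (only the two observed alleles per SNP).
counts = {
    "cases": {
        "rs9854771": {"A": 711, "G": 1289},
        "rs2736098": {"C": 1630, "T": 370},
        "rs206936": {"A": 1274, "G": 726},
        "rs763780": {"C": 157, "T": 1843},
        "rs9543325": {"C": 733, "T": 1267},
    },
    "controls": {
        "rs9854771": {"A": 685, "G": 1315},
        "rs2736098": {"C": 1532, "T": 468},
        "rs206936": {"A": 1311, "G": 689},
        "rs763780": {"C": 122, "T": 1878},
        "rs9543325": {"C": 827, "T": 1173},
    },
}


def allele_frequencies(allele_counts):
    """Divide each allele count by the total number of alleles observed."""
    total = sum(allele_counts.values())  # 2000 here: 1000 individuals x 2 alleles
    return {allele: n / total for allele, n in allele_counts.items()}


frequencies = {
    group: {snp: allele_frequencies(c) for snp, c in snps.items()}
    for group, snps in counts.items()
}

for group, snps in frequencies.items():
    for snp, freqs in snps.items():
        formatted = ", ".join(f"{allele}={f:.4f}" for allele, f in freqs.items())
        print(f"{group:8s} {snp:10s} {formatted}")
```
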
Looking at these data, which SNPs do you think are associated with pancreatic cancer susceptibility?
What do you think the risk alleles are for these SNPs?

Some SNPs have slightly different allele frequencies between cases and controls, but could these
differences simply reflect chance variation between samples? It can be hard to tell by looking at
the allele frequencies alone. To test whether the differences are significant, we need to do a
Fisher’s exact test.

Go to https://www.graphpad.com/quickcalcs/contingency1/

Use the contingency table to test the significance for each SNP. “Group 1” will be the cases and
“Group 2” will be the controls. The calculator adjusts for total sample size for you so enter the allele
counts rather than the frequencies. “Outcome 1” will be the minor allele count (that is the least
frequent allele in your sample) and “Outcome 2” will be the major allele count (that is the most
common allele in your sample). Make sure that the Fisher’s exact test is selected and that you
perform a 2-tailed test (since the allele frequency in cases may be higher or lower than that of
controls). Record the p-value for each SNP.
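For readers who want to cross-check the calculator offline, the test can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the calculator's own implementation: the function names are hypothetical, and the two-tailed p-value is defined here as the sum of the probabilities of all tables (with the same row and column totals) that are no more likely than the observed one. That is one common definition, so results may differ slightly from the website in the last decimal place.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (cases, controls); columns are outcomes (minor, major).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of seeing x in the top-left cell.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    cutoff = log_prob(a) + 1e-7  # small tolerance for floating-point ties
    p = sum(
        exp(log_prob(x))
        for x in range(lowest, highest + 1)
        if log_prob(x) <= cutoff
    )
    return min(p, 1.0)


# (minor, major) allele counts: cases first, then controls.
tables = {
    "rs9854771": (711, 1289, 685, 1315),
    "rs2736098": (370, 1630, 468, 1532),
    "rs206936": (726, 1274, 689, 1311),
    "rs763780": (157, 1843, 122, 1878),
    "rs9543325": (733, 1267, 827, 1173),
}

computed_p = {snp: fisher_exact_two_sided(*t) for snp, t in tables.items()}
for snp, p in computed_p.items():
    print(f"{snp:10s} p = {p:.4f}")
```
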

SNP        Position (chr:bp, hg38)  p-value
rs9854771  chr3:189790682           0.4069
rs2736098  chr5:1293971             0.0002
rs206936   chr6:34335092            0.2339
rs763780   chr6:52236941            0.0347
rs9543325  chr13:73342491           0.0026

Which SNPs are significantly associated with pancreatic cancer at a p-value threshold of 0.05?

rs9543325, rs763780, rs2736098

Were these the SNPs you predicted above?

rs9543325, rs763780, rs2736098

What is the “risk allele” for each variant (i.e. the allele that is more common in cases than controls)?

Is the risk variant the major or minor allele?

rs9543325-T (major), rs763780-C (minor), rs2736098-C (major)
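The same reasoning can be written as a short Python sketch. The function name `risk_allele` and the data layout are assumptions made for illustration; "major" and "minor" are judged here on the pooled case and control counts.

```python
# Allele counts for the three nominally significant SNPs, taken from the counts table.
significant_counts = {
    "rs9543325": {"cases": {"C": 733, "T": 1267}, "controls": {"C": 827, "T": 1173}},
    "rs763780": {"cases": {"C": 157, "T": 1843}, "controls": {"C": 122, "T": 1878}},
    "rs2736098": {"cases": {"C": 1630, "T": 370}, "controls": {"C": 1532, "T": 468}},
}


def risk_allele(case_counts, control_counts):
    """Return (allele, 'major' or 'minor') for the allele enriched in cases."""
    case_total = sum(case_counts.values())
    control_total = sum(control_counts.values())

    def excess_in_cases(allele):
        # Frequency in cases minus frequency in controls.
        return case_counts[allele] / case_total - control_counts[allele] / control_total

    allele = max(case_counts, key=excess_in_cases)
    # Major allele = most common allele in the pooled sample.
    pooled = {a: case_counts[a] + control_counts[a] for a in case_counts}
    status = "major" if pooled[allele] == max(pooled.values()) else "minor"
    return allele, status


risk = {
    snp: risk_allele(groups["cases"], groups["controls"])
    for snp, groups in significant_counts.items()
}
for snp, (allele, status) in risk.items():
    print(f"{snp}-{allele} ({status})")
```
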

In this experiment, we tested 5 variants. To correct for multiple testing, we can apply a
Bonferroni correction, adjusting our nominal significance threshold of 0.05 to 0.01 (i.e. 0.05 ÷ 5).

How many SNPs are significantly associated now?

Two: rs9543325, rs2736098

What would the corrected p-value threshold be if we had tested these SNPs as part of a GWAS
including 1 million variants?

0.05 ÷ 1,000,000 = 5 × 10⁻⁸

How many variants are significantly associated now?

None
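The three thresholds used in this task (nominal, Bonferroni for 5 tests, Bonferroni for 1 million tests) can be compared in a short Python sketch. The function names are hypothetical; the p-values are the ones recorded in the p-value table.

```python
# p-values recorded from the Fisher's exact tests.
p_values = {
    "rs9854771": 0.4069,
    "rs2736098": 0.0002,
    "rs206936": 0.2339,
    "rs763780": 0.0347,
    "rs9543325": 0.0026,
}


def bonferroni_threshold(alpha, n_tests):
    """Divide the nominal significance threshold by the number of tests."""
    return alpha / n_tests


def significant_snps(p_values, threshold):
    """Return the SNPs whose p-value falls below the threshold, sorted by name."""
    return sorted(snp for snp, p in p_values.items() if p < threshold)


for label, n_tests in [("nominal", 1), ("5 variants", 5), ("1 million variants", 1_000_000)]:
    threshold = bonferroni_threshold(0.05, n_tests)
    hits = significant_snps(p_values, threshold)
    print(f"{label:20s} threshold = {threshold:.0e}  significant: {hits}")
```
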
