
Received: 23 April 2020 Revised: 17 November 2020 Accepted: 27 November 2020
DOI: 10.1002/ajmg.b.32829
RESEARCH ARTICLE
An integrative systems-based analysis of substance use: eQTL-informed gene-based tests, gene networks, and biological mechanisms
Zachary F. Gerring1 | Angela Mina Vargas1 | Eric R. Gamazon2,3,4 | Eske M. Derks1
1 Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
2 Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
3 Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
4 Clare Hall, University of Cambridge, Cambridge, UK

Correspondence
Zachary F. Gerring, Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, 300 Herston Road, Brisbane, QLD, Australia.
Email: zachary.gerring@qimrberghofer.edu.au

Funding information
National Institutes of Health, Grant/Award Numbers: R01HG011138, R35HG010718; National Human Genome Research Institute

Abstract
Genome-wide association studies have identified multiple genetic risk factors underlying susceptibility to substance use, however, the functional genes and biological mechanisms remain poorly understood. The discovery and characterization of risk genes can be facilitated by the integration of genome-wide association data and gene expression data across biologically relevant tissues and/or cell types to identify genes whose expression is altered by DNA sequence variation (expression quantitative trait loci; eQTLs). The integration of gene expression data can be extended to the study of genetic co-expression, under the biologically valid assumption that genes form co-expression networks to influence the manifestation of a disease or trait. Here, we integrate genome-wide association data with gene expression data from 13 brain tissues to identify candidate risk genes for 8 substance use phenotypes. We then test for the enrichment of candidate risk genes within tissue-specific gene co-expression networks to identify modules (or groups) of functionally related genes whose dysregulation is associated with variation in substance use. We identified eight gene modules in brain that were enriched with gene-based association signals for substance use phenotypes. For example, a single module of 40 co-expressed genes was enriched with gene-based associations for drinks per week and biological pathways involved in GABA synthesis, release, reuptake and degradation. Our study demonstrates the utility of eQTL and gene co-expression analysis to uncover novel biological mechanisms for substance use traits.
KEYWORDS
gene co-expression, gene networks, genome-wide association study, substance use
1 | INTRODUCTION

Substance use is linked to hundreds of diseases and adverse societal outcomes (James et al., 2018). A reduction in the prevalence of substance use will therefore not only reduce the global burden of disease but reduce costs to individual sufferers, their families, and society. Substance use encompasses a range of behaviors (e.g., alcohol consumption, tobacco smoking, and cannabis use), each of which is moderately heritable with a highly polygenic background, where hundreds to thousands of genetic variants contribute to disease risk. Genome-wide association studies (GWAS) have identified hundreds of genomic regions that contain genetic risk variants (or single nucleotide polymorphisms [SNPs]) robustly associated with substance use traits, including, for example, alcohol use disorder (Sanchez-Roige et al., 2018) and dependence (Walters et al., 2018) (SNP heritability [h2SNP]: 9–12%), tobacco smoking (Liu et al., 2019) (h2SNP: 1–4%), and
162 © 2020 Wiley Periodicals LLC wileyonlinelibrary.com/journal/ajmgb Am J Med Genet. 2021;186B:162–172.

GERRING ET AL. 163
cannabis use (h2SNP: 11%) (Pasman et al., 2018). However, the functional interpretation of these regions remains largely unknown due in part to the complex local correlation structure of the genome (linkage disequilibrium) and complex interaction patterns between genes, known as the “co-localization problem” (Gamazon, Zwinderman, Cox, Denys, & Derks, 2019), making causal gene identification challenging. Single nucleotide polymorphisms, or genetic variants, may affect the expression of one or more genes or a broader network of genes within a disease-relevant tissue or cell type (Gamazon et al., 2018). We and others have linked genetic variants to changes in gene expression, known as expression quantitative trait loci (eQTL), to identify individual risk genes as well as groups of correlated genes (risk modules) for mental health (Gerring, Gamazon, & Derks, 2019) and substance use disorders (Marees et al., 2019). The advantage of this approach is that co-expressed genes can be causal for a trait without being influenced by the same genetic variant, thereby increasing the genomic search space for higher-order biological associations. In the present study, we will extend our earlier work (Marees et al., 2019) by generating gene co-expression modules characterized by correlated levels of gene expression. We will subsequently test for the enrichment of GWAS signals of eight substance use traits within these co-expression modules.

Different methods exist to integrate genetic and transcriptomic information, with a primary distinction between studies that use single-variant approaches (i.e., evaluating the impact of a single variant on gene expression) (Hormozdiari et al., 2016) versus gene-based approaches that combine information across multiple SNPs (i.e., imputation of gene expression at a gene-based level) (Gamazon et al., 2015; Gusev et al., 2016). Irrespective of the applied methodology, eQTL analyses are usually based on reference datasets in which genetic and transcriptomic information has been collected in disease-relevant tissues. For example, the Genotype-Tissue Expression (GTEx) project (version 7) contains genotype data linked to gene expression across 53 tissues from 714 donors, including 13 brain tissues from 216 donors. GTEx and other tissue-specific eQTL datasets represent a valuable resource with which to study gene expression and its relationship with genetic variation (eGTEx Project, 2017).

The integration of genetic variation and tissue-specific gene expression data has been used to prioritize functional gene candidates for substance use traits in disease-relevant tissues (i.e., brain tissue). For example, a secondary analysis of a nicotine dependence GWAS found an intronic SNP that regulates the expression of DNMT3B in brain (Hancock et al., 2018), while a similar analysis of cannabis dependence found genetic variation associated with the expression of CHRNA2 in brain (Demontis et al., 2019). In many instances, the functional gene candidate was not the gene closest to the associated risk variant; a GWAS of alcohol consumption, for example, identified risk variants within the gene KLB that affected the expression of two distantly located genes, RFC1 and RPL9 (Clarke et al., 2017). Indeed, associations in which the nearest gene is not the functional candidate are widespread in substance use traits, where some 66% of trait-associated eQTLs in GTEx targeted genes other than their closest gene (Marees et al., 2019).

While single-eQTL approaches have improved the functional annotation of individual SNPs, more recent approaches combine eQTL information across multiple SNPs in close proximity to a gene. These methods either impute gene expression levels using a reference dataset (Gamazon et al., 2015; Gusev et al., 2016) or incorporate eQTL information within a gene-based test (Gerring, Mina-Vargas, & Derks, 2019). We recently developed a gene-based test called eMAGMA, which performs gene-level testing by combining GWAS summary statistics, tissue-specific cis-eQTL information, and reference linkage disequilibrium data (Gamazon et al., 2019). eMAGMA and other gene-based approaches, such as S-PrediXcan, which imputes genetically regulated gene expression levels using GWAS summary statistics, are more powerful than single-eQTL annotation (Fryett, Inshaw, Morris, & Cordell, 2018) and may integrate tissue-specific gene expression information for the discovery of pathogenic and/or surrogate tissues. For example, a gene-based analysis of six substance use traits reported altered genetically regulated gene expression in case samples, with many candidate risk genes unique to either brain or whole blood (Marees et al., 2019). These results suggest many regulatory effects for substance use traits manifest in a subset of disease-relevant tissues such as brain; however, some effects may be shared across tissues and detected in other tissues with larger eQTL reference set sample sizes, such as whole blood. By collapsing multiple SNPs to individual functionally relevant genes, these approaches also facilitate the identification of shared mechanisms underlying substance use traits; gene-based analyses of lifetime cannabis use (Pasman et al., 2018) and alcohol consumption (Clarke et al., 2017) both identified CADM2 as a candidate risk gene, suggesting shared mechanisms underlying these traits.

Genetic studies suggest substance use is highly polygenic; many genes are likely to interact with one another in complex tissue- or cell-type specific networks that influence substance use risk. Gene co-expression network analysis describes the relationship between genes in terms of their pairwise correlation, where highly correlated genes may share a functional relationship (i.e., highly correlated genes are likely to be involved in the same biological process). A genetic perturbation that affects the expression of a single gene within a co-expression network may therefore alter the activity of a wider set of genes. We recently applied this heuristic in a gene co-expression network analysis of major depression, and identified novel gene candidates and gene modules both associated with major depression and disease-relevant biological pathways (Gamazon et al., 2019).

In the present study, we aim to improve our understanding of the biological mechanisms underlying eight substance use phenotypes by exploring associations with co-expression networks derived from human brain samples. First, to identify candidate causal genes, we will integrate GWAS summary statistics for each phenotype with eQTL information from brain tissues in GTEx using a novel gene-based method called eMAGMA (Gerring et al., 2019). Second, we will explore the gene-based overlap of associations across substance use phenotypes. Finally, we will use a gene co-expression network analysis to identify modules of genes enriched with gene-based association signals, before using biological pathway analysis to characterize the substance use risk modules.
2 | METHODS

2.1 | The GTEx project

Fully processed, filtered and normalized gene expression data for 13 brain tissues (Table S1) were downloaded from the Genotype-Tissue Expression project portal (version 7) (http://www.gtexportal.org). Only genes with ten or more donors were included. Other inclusion criteria for expressed genes were expression estimates >0.1 Reads Per Kilobase of transcript per Million mapped reads (RPKM) and an aligned read count of six or more within each tissue. Within each tissue, the distribution of RPKMs in each sample was quantile-transformed based on the average empirical distribution observed across all samples. Expression levels for each gene in each tissue were subsequently transformed to the quantiles of the standard normal distribution.

2.2 | GWAS of substance use traits

We downloaded GWAS summary statistics for eight substance use traits (smoking age of initiation [AOI], cigarettes per day [CGD], drinks per week [DPW], smoking cessation [SMC], smoking initiation, alcohol use disorder, alcohol dependence [ALD], and lifetime cannabis use) listed in Table 1. Detailed methods, including a description of population cohorts, quality control of raw SNP genotype data, and association analyses for substance use GWAS are provided in their respective publications (Liu et al., 2019; Pasman et al., 2018; Sanchez-Roige et al., 2018; Walters et al., 2018).

2.3 | eQTL-informed gene-level analysis of substance use GWAS signals

We identified and prioritized risk genes for each substance use phenotype using eMAGMA (version 1.08) (Gerring et al., 2019) and S-PrediXcan, both of which integrate GWAS summary statistics with eQTL information from the GTEx project. eMAGMA assigns SNPs within or near target genes based on significant (FDR < 0.05) SNP-gene associations in GTEx. Gene-based statistics were subsequently computed using the sum of the assigned SNP −log10 p values while accounting for linkage disequilibrium (LD). S-PrediXcan, on the other hand, imputes genetically regulated gene expression from training models to estimate the phenotype-expression association, while also controlling for LD. For both approaches, we used gene expression data for 13 brain tissues generated from GTEx (version 7) (Wang et al., 2018), and LD information from the 1000 Genomes Project Phase 3 (Delaneau, Marchini, & Consortium, 2014). For each tissue, we corrected for multiple testing using Bonferroni correction based on the number of genes per tissue (Table S1). Due to correlated expression across tissues, no correction for the number of phenotypes studied (N = 8) was performed.

2.4 | Fine-mapping of causal gene sets

S-PrediXcan and other transcriptomic approaches may yield false-positive gene-trait associations due to correlation (LD) among SNPs used to generate the eQTL weights in the prediction models (Mancuso et al., 2019). We used fine-mapping of causal gene sets (FOCUS) to appropriately model the impact of gene-trait correlations on the S-PrediXcan expression weights and assign a causal probability to each gene within substance use risk loci (Mancuso et al., 2019). FOCUS gene expression weights were built from 13 brain tissues in GTEx (version 7) (https://github.com/bogdanlab/focus/), and we used LD information from the 1000 Genomes Project Phase 3 (Delaneau et al., 2014) as reference genotypes.

2.5 | Identification of gene expression modules

Gene co-expression modules were constructed for 13 individual brain tissues using the weighted gene co-expression network analysis (WGCNA) package in R (Langfelder & Horvath, 2008). An unsigned pairwise correlation matrix—using Pearson's product moment
TABLE 1 GWAS summary statistics of substance use phenotypes
Phenotype | Total N | Cohort | N loci | Reference
Smoking age of initiation | 341,427 | GSCAN | 10 | Liu et al. (2019)
Cigarettes per day | 337,334 | GSCAN | 40 | Liu et al. (2019)
Drinks per week | 941,280 | GSCAN | 81 | Liu et al. (2019)
Smoking cessation | 547,219 | GSCAN | 16 | Liu et al. (2019)
Smoking initiation | 1,232,091 | GSCAN | 259 | Liu et al. (2019)
AUDIT | 121,604 | UKB | 10 | Sanchez-Roige et al. (2018)
Alcohol dependence | 46,568 | PGC | 1 | Walters et al. (2018)
Lifetime cannabis use | 162,082 | ICC/UKB/23andMe | 8 | Pasman et al. (2018)
Note: Total N includes the number of cases and controls (for binary traits). All GWAS included individuals of European descent, with the exception of the study by Walters et al. (2018), which included individuals of European and African descent.
Abbreviations: ICC, International Cannabis Consortium; GSCAN, GWAS and Sequencing Consortium of Alcohol and Nicotine; UKB, UK Biobank.
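
As an illustration of the eQTL-informed gene-level analysis described in Section 2.3, the following Python sketch assigns SNPs to genes using significant eQTL pairs, sums the −log10 p values of the assigned SNPs, and derives a per-tissue Bonferroni threshold. It is a minimal sketch under stated assumptions and not the eMAGMA implementation: the function names, SNP and gene identifiers, and p values are hypothetical, and the adjustment for linkage disequilibrium is omitted entirely.

```python
import math
from collections import defaultdict


def assign_snps_to_genes(eqtl_pairs, fdr_threshold=0.05):
    """Map each gene to the SNPs that are significant eQTLs for it in one tissue."""
    gene_to_snps = defaultdict(set)
    for snp, gene, fdr in eqtl_pairs:
        if fdr < fdr_threshold:
            gene_to_snps[gene].add(snp)
    return gene_to_snps


def gene_statistics(gene_to_snps, gwas_p):
    """Sum of -log10(p) over the SNPs assigned to each gene.

    Assumption: SNPs are treated as independent here; the real method
    accounts for linkage disequilibrium when evaluating this statistic.
    """
    stats = {}
    for gene, snps in gene_to_snps.items():
        tested = [s for s in snps if s in gwas_p]
        if tested:
            stats[gene] = sum(-math.log10(gwas_p[s]) for s in tested)
    return stats


def bonferroni_threshold(n_genes, alpha=0.05):
    """Per-tissue significance threshold based on the number of tested genes."""
    return alpha / n_genes


# Hypothetical eQTL table for one tissue: (SNP, gene, eQTL FDR)
eqtl_pairs = [
    ("rs1", "GENE_A", 0.01),
    ("rs2", "GENE_A", 0.20),  # not a significant eQTL, so not assigned
    ("rs3", "GENE_B", 0.001),
    ("rs4", "GENE_B", 0.03),
]
# Hypothetical GWAS p values
gwas_p = {"rs1": 1e-4, "rs2": 1e-9, "rs3": 1e-2, "rs4": 1e-3}

gene_to_snps = assign_snps_to_genes(eqtl_pairs)
stats = gene_statistics(gene_to_snps, gwas_p)
# 1,301 is the number of genes tested in amygdala reported in the Discussion
threshold = bonferroni_threshold(n_genes=1301)
```

Note that the strongly associated SNP rs2 does not contribute to GENE_A because it is not a significant eQTL for that gene; this is the sense in which the test is eQTL-informed rather than proximity-based.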
correlation coefficient—was calculated. An appropriate “soft-thresholding” value, which emphasizes strong gene–gene correlations at the expense of weak correlations, was selected for each tissue by plotting the strength of correlation against a series (range 2–20) of soft threshold powers. The correlation matrix was subsequently transformed into an adjacency matrix. Matrices are characterized by nodes (corresponding to genes) and edges (corresponding to the connection strength between genes). Each adjacency matrix was normalized using a topological overlap function. Hierarchical clustering was performed using average linkage, with one minus the topological overlap matrix as the distance measure. The hierarchical cluster tree was cut into gene modules using the dynamic tree cut algorithm (Langfelder, Zhang, & Horvath, 2008), with a minimum module size of 30 genes. We amalgamated modules if the correlation between their eigengenes (defined as the first principal component of their genes' expression values) was greater than or equal to 0.8.

2.6 | Gene-set analysis of gene co-expression modules

To identify gene co-expression modules enriched with substance use risk genes, we performed gene-set analysis of eMAGMA gene-level results in the derived tissue-specific gene co-expression modules using the gene-set analysis function in MAGMA v1.08 (de Leeuw, Mooij, Heskes, & Posthuma, 2015; Gerring et al., 2019). The competitive analysis tests whether the genes in a gene-set (i.e., gene co-expression module) are more highly associated with risk genes than other genes while accounting for gene size and gene density. We applied an adaptive permutation procedure (de Leeuw et al., 2015) (N = 10,000 permutations) to obtain p values corrected for multiple testing. The 1000 Genomes European reference panel (Phase 3) was used to account for linkage disequilibrium between SNPs. For each tissue and gene-based enrichment method, a quantile–quantile plot of observed versus expected p values was generated to assess inflation in the test statistic.

2.7 | Characterization of gene expression modules

Gene expression modules enriched with substance use GWAS association signals were assessed for enrichment of biological pathways and processes using g:Profiler (https://biit.cs.ut.ee/gprofiler/) (Reimand et al., 2016). Ensembl gene identifiers within substance use gene modules were used as input; we tested for the over-representation of module genes in Gene Ontology (GO) biological processes. We set the statistical domain scope (i.e., reference gene set) to “only annotated genes.” The g:Profiler algorithm uses a Fisher's one-tailed test for gene pathway enrichment; the smaller the p value, the lower the probability a gene belongs to both a co-expression module and a biological term or pathway purely by chance. Multiple testing correction was done using g:SCS; this approach accounts for the correlated structure of GO terms and biological pathways, and corresponds to an experiment-wide threshold of α = .05.

2.8 | Preservation of gene co-expression networks across tissues

To examine the tissue-specificity of modular enrichments and biological pathways, we assessed the preservation (i.e., replication) of network modules across GTEx brain tissues using the “modulePreservation” R function implemented in WGCNA (Langfelder, Luo, Oldham, & Horvath, 2011). Briefly, the module preservation approach takes as input “reference” and “test” network modules and calculates statistics for three preservation classes: (a) density-based statistics, which assess the similarity of gene–gene connectivity patterns between a reference network module and a test network module; (b) separability-based statistics, which examine whether test network modules remain distinct in reference network modules; and (c) connectivity-based statistics, which are based on the similarity of connectivity patterns between genes in the reference and test networks. We report the “Z-summary” statistic as a measure of preservation. A Z-summary value greater than 10 suggests there is strong evidence a module is preserved between the reference and test network modules, while a value between 2 and 10 indicates weak to moderate preservation and a value less than 2 indicates no preservation.

3 | RESULTS

3.1 | Study cohorts

The substance use phenotypes included in our study are presented in Table 1. The GSCAN (GWAS and Sequencing Consortium of Alcohol and Nicotine use) analysis of 5 substance use phenotypes in 1.2 million individuals contributed the largest number of significant loci (566 variants in 406 loci) for our study. All of the included studies, with the exception of ALD from the Psychiatric Genomics Consortium, used samples derived from the UK Biobank and/or 23andMe. Over half of the significant loci across the eight phenotypes were related to smoking initiation, which contained the largest number of samples (N = 1,232,091).

3.2 | Gene-based tests of association

To identify genes whose expression is influenced by genetic variation underlying disease risk, we performed eMAGMA using GWAS summary statistics and gene expression information from 13 brain tissues in GTEx v7 (Tables 2 and S2). We identified 276 unique gene-based associations across all brain tissues (after Bonferroni correction for the number of genes in each tissue) (Table S2). The number of significant genes for each phenotype was a function of GWAS sample size; 124 genes in 13 brain tissues were associated with smoking initiation (GWAS N samples = 1,232,091), while a single significant gene was associated with ALD (GWAS N samples = 46,568).

There was no overlap in significant eMAGMA associations across all phenotypes, and only modest overlap between phenotype pairs.
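
The network construction steps described in Section 2.5 (unsigned correlation, soft thresholding, and topological overlap) can be sketched in a few lines of Python. This is a toy illustration only and not the WGCNA package used in the study: the gene names, expression values, and the soft-threshold power of 6 are hypothetical, and the subsequent clustering, dynamic tree cutting, and module merging steps are not shown.

```python
import math


def pearson(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def unsigned_adjacency(expr, power):
    """Soft-thresholded adjacency |r|^power; the diagonal is left at zero."""
    genes = sorted(expr)
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[genes[i]], expr[genes[j]])) ** power
            adj[i][j] = adj[j][i] = a
    return genes, adj


def topological_overlap(adj):
    """Unsigned topological overlap: shared neighbours plus direct connection,
    scaled by the smaller connectivity of the two genes."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom


# Hypothetical expression values for four genes across five samples
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "G2": [1.1, 2.0, 2.9, 4.2, 5.0],  # strongly correlated with G1
    "G3": [3.0, 1.0, 4.0, 1.0, 5.0],
    "G4": [2.0, 7.0, 1.0, 8.0, 2.0],  # negatively correlated with G3
}

genes, adj = unsigned_adjacency(expr, power=6)  # power chosen for illustration
tom = topological_overlap(adj)
# One minus the topological overlap is the distance used for average-linkage clustering
dissimilarity = [[1.0 - t for t in row] for row in tom]
```

Because the network is unsigned, the negatively correlated pair G3 and G4 still receives a high adjacency, whereas weak correlations are pushed towards zero by the soft-threshold power.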
TABLE 2 Number and top associated eMAGMA association across substance use phenotypes
Phenotype | N | Top gene | Chr | p value | Tissue
Smoking age of initiation | 9 | BORCS5 | 10 | 5.77 × 10−15 | Spinal cord cervical C-1
Cigarettes per day | 46 | CHRNA5 | 15 | 5.00 × 10−16 | Caudate basal ganglia
Drinks per week | 92 | FUT2 | 2 | 2.11 × 10−15 | Anterior cingulate cortex
Smoking cessation | 30 | CHRNA4 | 20 | 7.57 × 10−10 | Cerebellum
Smoking initiation | 124 | BORCS7 | 10 | 5.77 × 10−15 | Spinal cord cervical C-1
Alcohol use disorder | 31 | NDUFS3 | 11 | 3.54 × 10−10 | Cerebellar hemisphere
Alcohol dependence | 1 | ADH1C | 4 | 5.46 × 10−6 | Spinal cord cervical C-1
Lifetime cannabis use | 9 | SMG6 | 17 | 3.62 × 10−8 | Cerebellum
Note: N denotes the number of significant genes. The eMAGMA analysis was restricted to 13 brain related tissues in GTEx.
TABLE 3 Overlap in the number of significant gene-based (eMAGMA) associations across substance use phenotypes
Phenotype | Significant genes (N) | Smoking age of initiation | Cigarettes per day | Drinks per week | Smoking cessation | Smoking initiation | Alcohol use disorder | Alcohol dependence | Lifetime cannabis use
Smoking age of initiation | 9
Cigarettes per day | 46 | 1
Drinks per week | 92 | 0 | 0
Smoking cessation | 30 | 0 | 11 | 4
Smoking initiation | 124 | 1 | 7 | 6 | 8
Alcohol use disorder | 31 | 0 | 0 | 28 | 3 | 2
Alcohol dependence | 1 | 0 | 0 | 1 | 0 | 0 | 0
Lifetime cannabis use | 9 | 0 | 0 | 3 | 0 | 3 | 2 | 0
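
The pairwise counts in Table 3 amount to set intersections between the lists of significant genes for each phenotype. The sketch below shows one way this could be computed in Python; the function name is hypothetical, and the gene sets are a small illustrative subset of genes named in the tables rather than the full result lists, so the counts it yields are not those of Table 3.

```python
from itertools import combinations


def pairwise_overlap(significant):
    """Return the sorted list of shared genes for every pair of phenotypes.

    significant: dict mapping a phenotype label to a set of gene symbols.
    """
    overlap = {}
    for a, b in combinations(sorted(significant), 2):
        overlap[(a, b)] = sorted(significant[a] & significant[b])
    return overlap


# Illustrative subset only (assumption): not the complete eMAGMA results
significant = {
    "ALD": {"ADH1C"},
    "DPW": {"ADH1C", "CADM2", "TUFM", "FUT2"},
    "CAN": {"CADM2", "TUFM", "SMG6"},
}

overlap = pairwise_overlap(significant)
counts = {pair: len(genes) for pair, genes in overlap.items()}
```

Keeping the shared gene symbols as well as the counts gives both the numeric cross-tabulation and the gene-level cross-tabulation from a single pass over the phenotype pairs.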
For example, 28 genes were significantly associated with both alcohol use disorder and DPW (Table 3). There was a high correlation between the number of samples for each tissue and significant gene-based associations (Pearson's r = .87). Cerebellum accounted for the largest number of significant associations (N associations = 135) and also contained the largest number of post-mortem brain samples (N samples = 154). We compared the number of significant associations from the eMAGMA analysis with previous findings from conventional MAGMA and S-PrediXcan (Table S3). The total number of eMAGMA associations is smaller than the number of significant conventional MAGMA associations, but larger than the number of S-PrediXcan associations. A total of 143 unique genes were significant using eMAGMA associations but not conventional MAGMA or S-PrediXcan (Table S4).

The gene CADM2, which has been linked to behavioral undercontrol, was associated with 4 substance use phenotypes (DPW, alcohol use disorder, smoking initiation, and cannabis use). Furthermore, the effect direction of CADM2 was consistent across phenotypes (Table S5). Another four genes (AMT, CHRNA2, GPX1, KANSL1) were significant across three phenotypes (CGD, AOI, and SMC) (Table 4). Overall, we found moderate correlation of eMAGMA Z-scores between phenotype pairs (Table S6 and Figure S1), with the strongest correlations between alcohol use disorder and DPW (Pearson's r = .253, p < 2.22 × 10−16) and smoking initiation and DPW (r = .215, p = 1.72 × 10−13).

3.3 | Fine-mapping further prioritizes genes within GWAS risk loci

We applied the fine-mapping of causal gene sets (FOCUS) algorithm to prioritize genes within substance use GWAS risk loci. All of the phenotypes, with the exception of ALD, contained “credible” genes (that is, genes most likely to be causal for a given phenotype). We identified a total of 269 unique credible genes across 77 distinct loci for 7 substance use phenotypes. Smoking initiation had the largest number of loci with credible genes (N = 42 loci containing 117 credible genes), followed by CGD (N = 19 loci containing 46 credible genes). Candidate causal genes with the highest posterior inclusion probability (PIP) included FPGT (S-PrediXcan Z score −6.33; PIP: 1) for smoking initiation; ZNF780B (S-PrediXcan Z score 5.37; PIP: 1) for SMC; RFC1 (S-PrediXcan Z score −9.41; PIP: 1) for DPW; SNRPA (S-PrediXcan Z score −9.44; PIP: 1) for CGD; CADM2 (S-PrediXcan Z score 4.38; PIP: 0.624) for lifetime cannabis use; GRK4 (S-PrediXcan Z score −4.7; PIP: 0.542) for AOI; and FAM180B (S-PrediXcan Z score −5.74; PIP: 0.749) for alcohol use disorder. A full list of credible
TABLE 4 Cross-tabulation of significant gene-based (eMAGMA) associations across substance use phenotypes
Phenotype pair | Shared genes
Cigarettes per day, smoking age of initiation | GRK4
Smoking cessation, cigarettes per day | C19ORF54; CHRNA4; P4HTM; NCKIPSD; WDR6; CCDC71; DALRD3; AMT; QRICH1; GPX1; XRCC3
Smoking cessation, drinks per week | CSDC2; NEGR1; DNAJB7; POLR3H
Smoking initiation, smoking age of initiation | CUL3
Smoking initiation, cigarettes per day | P4HTM; NCKIPSD; WDR6; CCDC71; DALRD3; AMT; GPX1
Smoking initiation, drinks per week | KANSL1; CADM2; CPSF4; ZKSCAN5; ADAM15; NOB1
Smoking initiation, smoking cessation | GPX1; AMT; NCKIPSD; P4HTM; DALRD3; WDR6; CCDC71; HIST1H1C
Alcohol use disorder, drinks per week | FUT2; KANSL1; CRHR1; LRRC37A; PLEKHM1; MAMSTR; ARL17A; ARL17B; ARHGAP27; SPPL2C; NDUFS3; FMNL1; MAPT; LRRC37A2; NTN5; C1QTNF4; RFC1; TUFM; PSMC3; SLC39A13; MTCH2; FAM180B; CADM2; ZNF513; NSF; TAT; ARHGAP1; DNAJB7
Alcohol use disorder, smoking cessation | DNAJB7; EP300; ZC3H7B
Alcohol use disorder, smoking initiation | KANSL1; CADM2
Alcohol dependence, drinks per week | ADH1C
Lifetime cannabis use, drinks per week | SULT1A2; TUFM; CADM2
Lifetime cannabis use, smoking initiation | SRR; SMG6; CADM2
Lifetime cannabis use, alcohol use disorder | CADM2; TUFM
All other phenotype pairs | 0
TABLE 5 Tissue-specific gene co-expression modules enriched with gene-based association signals for substance use phenotypes
Module | N genes | p (adjusted) | Tissue | Biological process
AOI-1 | 260 | .0018 | Putamen basal ganglia | Nucleic acid metabolic process; Gene expression (transcription)
AOI-2 | 75 | 3.9 × 10−5 | Spinal cord cervical C-1 | Metabolism of RNA; RNA processing
ALD-1 | 1,155 | .0029 | Anterior cingulate cortex BA24 | Nervous system development; Neuronal system
CAN-1 | 35 | .0018 | Cerebellar Hemisphere | Trans-synaptic signaling; Neuronal system
DPW-1 | 40 | .0003 | Anterior cingulate cortex BA24 | GABA synthesis, release, reuptake and degradation; Neurotransmitter transport
DPW-2 | 163 | .0008 | Nucleus accumbens basal ganglia | Eukaryotic translation elongation; Transcription-coupled nucleotide-excision repair
SMC-1 | 29 | .0014 | Cortex | Immune response; Immune system
CGD-1 | 114 | .0022 | Nucleus accumbens basal ganglia | SRP-dependent protein targeting to membrane; Eukaryotic translation elongation
Note: N genes denotes the number of genes in a module; p value corrected for correlation gene expression, gene size, and gene density; Biological process categories were derived from an over-representation analysis of module genes in gene ontology biological process terms and are corrected for multiple testing.
Abbreviations: ALD, alcohol dependence; AOI, age of smoking initiation; CAN, cannabis initiation; CGD, cigarettes per day; DPW, drinks per week; SMC, smoking cessation.
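
The biological process column in Table 5 comes from an over-representation analysis, which Section 2.7 describes as a one-tailed Fisher's test. A minimal Python sketch of such a test, written as a hypergeometric upper tail, is given below. It is not the g:Profiler implementation and does not include the g:SCS correction; the function name, the size of the annotated gene universe, the term size, and the overlap are hypothetical, and only the module size of 40 genes is taken from Table 5 (module DPW-1).

```python
from math import comb


def enrichment_p(n_universe, n_term, n_module, n_overlap):
    """One-tailed over-representation p value.

    Probability of observing at least n_overlap term genes in a module of
    n_module genes drawn from n_universe annotated genes, of which n_term
    belong to the term (hypergeometric upper tail).
    """
    upper = min(n_term, n_module)
    total = comb(n_universe, n_module)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: a 40-gene module with 5 genes in a 100-gene term,
# tested against an assumed universe of 18,000 annotated genes
p_example = enrichment_p(n_universe=18000, n_term=100, n_module=40, n_overlap=5)
```

In practice the resulting p values would still need correction across all tested terms, which in the study was handled by g:SCS.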
genes for each phenotype is provided in Table S7. We assessed the overlap in credible genes across phenotypes. A total of 43 credible genes were prioritized in more than one phenotype (Table S8). Interestingly, the genes SNRPA and ZNF780B had posterior probabilities close to or equal to 1 for both SMC and CGD, while the S-PrediXcan Z scores for these genes had opposite effect directions. This is consistent with the inverse relationship between the phenotypes, and provides strong evidence of their involvement in substance use risk.

3.4 | Network-based enrichment of substance use risk genes

We tested for the enrichment of eMAGMA gene-based association signals in brain gene co-expression networks. AOI, ALD, cannabis initiation (CAN), CGD, DPW, and SMC each showed enrichment of gene-based association signals in at least one module (Table 5). The module DPW-1 had the largest proportion of gene-based associations with a nominal p value <.05 (N = 9 genes; 22.5%), followed by the module CGD-1 (N = 21; 18.4%) (Table S9). The module DPW-2 also harbored two genes, TUFM (nucleus accumbens basal ganglia: p = 2.19 × 10−12) and RPL9 (nucleus accumbens basal ganglia: p = 9.57 × 10−7), with significant eMAGMA gene-based associations, highlighting their potentially coordinated association with DPW. Furthermore, the genes RPS26 and SNF8 had nominally significant eMAGMA p values in the modules DPW-2 (RPS26, p = .0338; SNF8, p = .0006) and AOI-2 (RPS26, p = 5.45 × 10−5; SNF8, p = .0006), suggesting some shared modular activity across substance use phenotypes (Table S9). A biological category association analysis of the enriched modules identified processes related to RNA processing (module AOI-2; p = 5.12 × 10−8), trans-synaptic signaling (module CAN-1), GABA synthesis, release, reuptake and degradation (module DPW-1; p = 1.39 × 10−6), and the immune response (module SMC-1; p = 1.64 × 10−67) (Tables 5 and S10). We extracted eMAGMA associations for genes that intersect both the enriched module and significant biological pathways (Table S11). Several biological pathways had a relatively large proportion of nominally significant eMAGMA associations. For example, 4 out of 8 overlapping module genes for the AOI-2 pathway “metabolism of RNA” contained eMAGMA p values <.05 (Table S12). These data support the involvement of the gene co-expression modules in substance use, although the overlap between eMAGMA associations and biological pathways is modest for several phenotype modules (e.g., DPW-1 “neurotransmitter transport” contains 2 genes with eMAGMA associations, one of which has a nominal p value <.05).

There was strong preservation (Z score > 10) of gene connectivity structure within significant modules across brain tissues (Figure 1); however, DPW-2 (anterior cingulate cortex enriched with developmental and neurotransmitter pathways) had slightly lower preservation compared to other tissue modules. These data suggest modules and pathway enrichments may be generalized across tissue types for substance use traits and provide further support to maximize tissue sample size for a single brain tissue/region rather than maximizing brain region coverage.

4 | DISCUSSION

Genetic risk factors for substance use alter the expression of target genes, which may in turn influence the activity of highly co-expressed (but not necessarily co-regulated) genes in a tissue-specific manner. We used expression quantitative trait loci from 13 brain tissues in a
FIGURE 1 Preservation of gene connectivity across co-expression modules enriched with gene-based association signals for substance use traits. A Z-summary value greater than 10 suggests there is strong evidence a module is preserved between the reference and test network modules, while a value between 2 and 10 indicates weak to moderate preservation and a value less than 2 indicates no preservation. Gray boxes indicate the tissue in which the significant association was found. AOI, age of smoking initiation; DPW, drinks per week; SMC, smoking cessation.
novel gene-based test (eMAGMA) to identify candidate risk genes for the substantial transcriptomic overlap between alcohol consumption
8 substance use traits. The risk genes were subsequently tested for and alcohol use disorder, which is in line with genetic findings.
enrichment in tissue-specific gene co-expression networks to identify In a comparison of eMAGMA and other gene-based methods,
groups of highly correlated genes associated with substance use and eMAGMA performed similarly to S-PrediXcan in terms of number of sig-
improve the biological interpretation of gene-based associations. We nificant associations, while it shows a 1.2- to 7-fold reduction compared
identified 276 gene-based associations across 8 substance use traits, to MAGMA gene-based test results (Gerring et al., 2019). The latter find-
many of which were associated with multiple traits. Candidate risk ing is not unexpected since the total number of tested genes in
genes for 6 substance use traits (AOI, ALD, CAN, DPW, CGD, and eMAGMA (i.e., genes of which gene expression is controlled by at least
SMC) were enriched in at least one co-expression module, which con- one eQTL) is substantially lower than the total number of protein-coding
tained genes involved in gene expression and cellular metabolism, as genes (e.g., the number of tested genes in amygdala using eMAGMA is
well as neurotransmitter transport and trans-synaptic signaling. These 1,301 versus 18,128 tested genes using conventional MAGMA). How-
results demonstrate the utility of integrating genetic, gene expression, ever, while eMAGMA identifies fewer genes than its conventional
and gene co-expression data for the biological interpretation of com- MAGMA counterpart, the gene candidates are directly linked to the reg-
plex traits such as substance use. ulation of gene expression in a particular tissue and thereby offer a bio-
Our eMAGMA gene-based approach annotates target genes by logically meaningful substrate for follow-up analyses.
assigning genetic variants to genes based on tissue-specific eQTL Our approach enables the study of tissue-specific gene expression
information before testing for the enrichment of GWAS signals in tar- changes underlying substance use traits. The majority of the significant
get genes. The number of significant gene-based associations across associations were detected in cerebellum, a region that has been impli-
the 8 substance use traits ranged from 1 (alcohol dependence) to cated in addiction (Moulton, Elman, Becerra, Goldstein, &
124 (smoking initiation). The number of associations was a function of Borsook, 2014). While a robust functional mechanism specific to cere-
GWAS sample size, highlighting the importance of sample size in bellum has not been established, a recent study in mice showed that the
genetic studies of complex traits. In line with a strong genetic correla- cerebellum controls the reward circuitry and social behavior through
tion between problematic alcohol use disorder and number of DPW direct projections from the deep cerebellar nuclei to the brain's reward
(rg = 0.77) as reported by Zhou et al. (2020), the strongest trans- center (i.e., the ventral tegmental area) (Carta, Chen, Schott, Dorizan, &
criptomic correlation was observed between alcohol use disorder and Khodakhah, 2019). This suggests changes in gene expression in cerebel-
DPW (r = .26). These two alcohol-related traits also showed the larg- lum precipitate behavioral changes related to substance use. It should be
est overlap in terms of significant eMAGMA associations with noted, however, that cerebellar gene-based associations may be proxy
28 genes found to be significantly associated with both traits. Despite associations for a causal tissue or cell type, given cerebellum has the larg-
the moderate genetic correlation (rg = 0.42) between initiation of est number of brain tissue samples in GTEx thereby increasing statistical
tobacco and cannabis (Pasman et al., 2018), we observed a low trans- power to identify gene associations.
criptomic correlation (r = .17) with no overlap in significant eMAGMA Previous studies showed moderate to large correlations of additive
gene-associations. In summary, a systematic comparison between genetic effects across substance use traits (Nivard et al., 2016; Vink &
traits related to initiation, consumption versus dependence is limited Schellekens, 2018). We aimed to investigate whether the genetic corre-
due to differences in sample size, but the most striking observation is lations would be recapitulated in terms of gene-level associations.
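The gene-level overlap between traits discussed in this section can be illustrated with a minimal Python sketch. It assumes that overlap is computed as the share of one trait's significant genes that are also significant for a second trait; this definition, the function names, and the placeholder gene lists are illustrative assumptions and are not taken from the eMAGMA software or the study data. A symmetric alternative (the Jaccard index) is included for contrast.

```python
def overlap_fraction(genes_a, genes_b):
    """Fraction of the genes in genes_a that also appear in genes_b.

    This measure is directional: swapping the arguments can change the result.
    """
    set_a, set_b = set(genes_a), set(genes_b)
    if not set_a:
        raise ValueError("genes_a must contain at least one gene")
    return len(set_a & set_b) / len(set_a)


def jaccard_index(genes_a, genes_b):
    """Symmetric overlap: shared genes divided by all genes in either list."""
    set_a, set_b = set(genes_a), set(genes_b)
    union = set_a | set_b
    if not union:
        raise ValueError("at least one gene list must be non-empty")
    return len(set_a & set_b) / len(union)


# Hypothetical placeholder gene lists (not results from the study).
trait_a = ["GENE1", "GENE2", "GENE3", "GENE4"]
trait_b = ["GENE1", "GENE2", "GENE3", "GENE5", "GENE6", "GENE7"]

print(overlap_fraction(trait_a, trait_b))  # 0.75: 3 of trait_a's 4 genes are shared
print(overlap_fraction(trait_b, trait_a))  # 0.5: 3 of trait_b's 6 genes are shared
print(jaccard_index(trait_a, trait_b))     # 3 shared genes out of 7 distinct genes
```

Because the directional fraction depends on which trait supplies the denominator, a trait with few significant genes can show a high percentage overlap with a trait that has many, which is one reason such percentages are hard to compare with genome-wide genetic correlations.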
Indeed, we observed substantial overlap for some trait combinations with high genetic correlations. For example, 82% of the genes that were significantly associated with alcohol use disorder were also linked to the number of DPW. This is higher than the genetic correlation between the two phenotypes, which may be the result of eMAGMA assigning different genetic variants underlying each phenotype to the same gene, increasing the overlap between phenotypes. However, it is difficult to compare the level of overlap in gene-level associations, which relate to specific loci, and genetic correlations, which measure genome-wide significant correlations. Interestingly, gene-level associations for lifetime cannabis use showed substantial overlap with DPW (32% overlap) and smoking initiation (27% overlap). One of the genes contributing to the genetic overlap is CADM2, which was found to be associated with 4 out of 8 traits (i.e., alcohol consumption, alcohol use disorder, smoking initiation, and cannabis use). Importantly, CADM2 was found to have consistent effect directions across phenotypes, highlighting a shared biological effect in substance use. CADM2 was previously found to be associated with a broad profile of risk-taking behavior and behavioral under-control (Boutwell et al., 2017; Morris et al., 2019; Sanchez-Roige et al., 2019). Furthermore, CADM2-knockout mice have increased locomotor activity and reduced body weight, suggesting an important role in behavioral regulation and energy homeostasis (Yan et al., 2018). The robust association between CADM2 expression and multiple substance use traits highlights the need for functional studies to further explore the causal gene mechanisms.

We also detected the susceptibility locus at a chromosome 3p21.31 gene cluster for smoking-related phenotypes: smoking initiation, CGD, and smoking cessation. The cluster covers seven genes with eMAGMA associations (AMT, GPX1, NCKIPSD, P4HTM, WDR6, DALRD3, and CCDC71), several of which have been related to intelligence and cognitive functional measurement (Hill et al., 2019). None of the predicted expression models in our fine-mapping (FOCUS) analysis explained the observed S-PrediXcan associations for these genes, meaning a putative causal gene could not be prioritized in the locus. This is most likely due to high linkage disequilibrium at the locus. Nonetheless, these associations are consistent with the highly negative genetic correlation of smoking-related phenotypes with years of education (Liu et al., 2019). Other overlapping gene-based associations included MAPT and CRHR1 for alcohol use disorder and DPW. These genes are located within a common inversion polymorphism at chromosome 17q21.31, which is related to alterations in tissue-specific gene expression (de Jong et al., 2012) and neurodegenerative disorders such as Parkinson's disease and Alzheimer's disease (Myers et al., 2005; Skipper et al., 2004). However, a causative role of individual genes within this locus in substance use has not been established and cannot be inferred from the present data.

Our network-based approach identified gene co-expression networks enriched with GWAS signals of AOI, ALD, DPW, CAN, CGD, and SMC. The implicated modules were enriched in biological pathways related to cellular metabolism (AOI-2, “cellular metabolic process”, p = 1.19 × 10⁻⁹) and trans-synaptic signaling processes (CAN-1, “glutamate receptor signaling pathway,” p = 1.21 × 10⁻¹⁹), among others. The term “cellular metabolism” encompasses all chemical reactions involving the breakdown of drug compounds and alcohols and would therefore be expected to be associated with substance use. Similarly, the enrichment of cannabis use risk genes in a module involved in glutamate receptor signaling could be expected given psychotic symptoms associated with cannabis use are thought to involve altered glutamate signaling (Colizzi, McGuire, Pertwee, & Bhattacharyya, 2016). Interestingly, a module enriched with risk genes underlying DPW-1 was associated with the biological process “GABA synthesis, release, reuptake and degradation” (p = 1.39 × 10⁻⁶). GABA (gamma-aminobutyric acid) is a critical inhibitory neurotransmitter in the brain (Jembrek & Vlainic, 2015), and may be involved in reward behavior in the early stages of substance addiction (Barker & Hines, 2020). Alcohol directly binds to GABA receptors, causing the release of the neurotransmitter and inducing the sedative effects associated with alcohol use disorder (Banerjee, 2014), suggesting genes within this module may represent targets in the treatment of alcohol addiction. Therefore, these findings represent some of the first evidence that alterations in genetically regulated expression in anterior cingulate cortex may influence alcohol consumption behavior through changes in the brain's reward circuitry and warrant follow-up validation studies.

The findings of this study should be interpreted in view of the following limitations. First, although GTEx is one of the most comprehensive genetic expression databases available to date, the statistical power for eQTL discovery is still modest (GTEx Consortium, 2015). We observed a strong correlation (Pearson's r = .87) between the postmortem sample size and the number of gene discoveries, suggesting that molecular studies of substance use phenotypes should maximize brain tissue sample size. It should be noted, however, that as the sample size of GTEx continues to increase, the number of genes with significant eQTLs (eGenes) will plateau and further increases in sample size will have little impact on biological conclusions. Second, our analyses focus on the role of eQTLs in brain tissues while recent studies have shown that eQTL effects may differ between cell types within a specific tissue (van der Wijst et al., 2018). Cerebellum, for example, contains the largest number of neurons in the human brain (Van Essen, Donahue, & Glasser, 2018), potentially increasing the likelihood of identifying neuronal-specific pathways compared to other brain regions. Third, the identified genes should be seen as “candidates” as correlated levels of gene expression in high LD genomic regions make it challenging to identify the true causal genes (Wainberg et al., 2019). Finally, our gene co-expression analyses rely on the stability (i.e., robustness) of gene networks both within and between tissues (Gamazon et al., 2019).

In summary, we assessed gene targets and biological pathways underlying eight phenotypes related to the initiation, use, or abuse of tobacco, alcohol, and cannabis. Our gene-based approach, eMAGMA, identified 276 candidate risk genes across these traits whose expression is altered in at least one of 13 brain tissues. We confirmed substantial gene-based overlap between traits, with the highest overlap between DPW and alcohol use disorder. The gene CADM2, recently associated with lifetime cannabis use, risk-taking behavior, and behavioral under-control, was found to be associated with half of the included traits. We used gene co-expression networks in brain to identify broader, functionally related modules (groups) of genes potentially implicated in substance use. Six gene modules across three phenotypes (AOI, alcohol consumption, and smoking cessation) were enriched with gene-based
associations. One of the co-expression modules associated with number of DPW, in anterior cingulate cortex, was enriched with biologically meaningful pathways related to GABA release and degradation, highlighting the utility of our approach in describing the molecular characteristics of alcohol consumption. The integration of summary statistics from larger GWAS of measures of substance use with gene expression data from brain tissues, provided by GTEx (Aguet et al., 2019) and other consortia (Wang et al., 2018), will facilitate the translation of statistical associations to the discovery of causal genes and molecular mechanisms.

ACKNOWLEDGMENT

Eric R. Gamazon is supported by the National Human Genome Research Institute of the National Institutes of Health under Award Numbers R35HG010718 and R01HG011138.

AUTHOR CONTRIBUTIONS

Zachary F. Gerring and Eske M. Derks: Conceptualization. Zachary F. Gerring and Angela M. Vargas: Formal analysis. Zachary F. Gerring, Eric R. Gamazon, Eske M. Derks: Methodology. Eric R. Gamazon, Eske M. Derks: Supervision. Zachary F. Gerring: Writing – original draft. Zachary F. Gerring, Eske M. Derks, Eric R. Gamazon: Writing – review and editing.

DISCLOSURE OF INTERESTS

Eric R. Gamazon receives an honorarium from Circulation Research of the AHA as member of the Editorial Board. He performed consulting for the City of Hope/Beckman Research Institute. All other authors report no conflicts of interest.

DATA AVAILABILITY STATEMENT

The data that supports the findings of this study are available in the supplementary material of this article.

ORCID

Zachary F. Gerring https://orcid.org/0000-0002-2445-1266
Eric R. Gamazon https://orcid.org/0000-0003-4204-8734

REFERENCES

Aguet, F., Barbeira, A. N., Bonazzola, R., Brown, A., Castel, S. E., Jo, B., … Lappalainen, T. (2019). The GTEx Consortium atlas of genetic regulatory effects across human tissues. BioRxiv, 787903. https://doi.org/10.1101/787903
Banerjee, N. (2014). Neurotransmitters in alcoholism: A review of neurobiological and genetic studies. Indian Journal of Human Genetics, 20(1), 20–31. https://doi.org/10.4103/0971-6866.132750
Barker, J. S., & Hines, R. M. (2020). Regulation of GABA(A) receptor subunit expression in substance use disorders. International Journal of Molecular Sciences, 21(12), 4445. https://doi.org/10.3390/ijms21124445
Boutwell, B., Hinds, D., 23andMe Research Team, Tielbeek, J., Ong, K. K., Day, F. R., & Perry, J. R. B. (2017). Replication and characterization of CADM2 and MSRA genes on human behavior. Heliyon, 3(7), e00349. https://doi.org/10.1016/j.heliyon.2017.e00349
Carta, I., Chen, C. H., Schott, A. L., Dorizan, S., & Khodakhah, K. (2019). Cerebellar modulation of the reward circuitry and social behavior. Science, 363(6424), eaav0581. https://doi.org/10.1126/science.aav0581
Clarke, T.-K., Adams, M. J., Davies, G., Howard, D. M., Hall, L. S., Padmanabhan, S., … McIntosh, A. M. (2017). Genome-wide association study of alcohol consumption and genetic overlap with other health-related traits in UK Biobank (N=112 117). Molecular Psychiatry, 22(10), 1376–1384. https://doi.org/10.1038/mp.2017.153
Colizzi, M., McGuire, P., Pertwee, R. G., & Bhattacharyya, S. (2016). Effect of cannabis on glutamate signalling in the brain: A systematic review of human and animal evidence. Neuroscience & Biobehavioral Reviews, 64, 359–381. https://doi.org/10.1016/j.neubiorev.2016.03.010
de Jong, S., Chepelev, I., Janson, E., Strengman, E., van den Berg, L. H., Veldink, J. H., & Ophoff, R. A. (2012). Common inversion polymorphism at 17q21.31 affects expression of multiple genes in tissue-specific manner. BMC Genomics, 13(1), 458. https://doi.org/10.1186/1471-2164-13-458
de Leeuw, C. A., Mooij, J. M., Heskes, T., & Posthuma, D. (2015). MAGMA: Generalized gene-set analysis of GWAS data. PLoS Computational Biology, 11(4), e1004219.
Delaneau, O., Marchini, J., & Consortium, T. 1000 G. P. (2014). Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nature Communications, 5, 3934.
Demontis, D., Rajagopal, V. M., Thorgeirsson, T. E., Als, T. D., Grove, J., Leppälä, K., … Børglum, A. D. (2019). Genome-wide association study implicates CHRNA2 in cannabis use disorder. Nature Neuroscience, 22(7), 1066–1074. https://doi.org/10.1038/s41593-019-0416-1
eGTEx Project. (2017). Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics, 49, 1664–1670.
Fryett, J. J., Inshaw, J., Morris, A. P., & Cordell, H. J. (2018). Comparison of methods for transcriptome imputation through application to two common complex diseases. European Journal of Human Genetics, 26(11), 1658–1667. https://doi.org/10.1038/s41431-018-0176-5
Gamazon, E. R., Segrè, A. V., van de Bunt, M., Wen, X., Xi, H. S., Hormozdiari, F., … Consortium, Gte. (2018). Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nature Genetics, 50(7), 956–967. https://doi.org/10.1038/s41588-018-0154-4
Gamazon, E. R., Wheeler, H. E., Shah, K. P., Mozaffari, S. V., Aquino-Michaels, K., Carroll, R. J., … Im, H. K. (2015). A gene-based association method for mapping traits using reference transcriptome data. Nature Genetics, 47(9), 1091–1098.
Gamazon, E. R., Zwinderman, A. H., Cox, N. J., Denys, D., & Derks, E. M. (2019). Multi-tissue transcriptome analyses identify genetic mechanisms underlying neuropsychiatric traits. Nature Genetics, 51(6), 933–940. https://doi.org/10.1038/s41588-019-0409-8
Gerring, Z. F., Gamazon, E. R., & Derks, E. M. (2019). A gene co-expression network-based analysis of multiple brain tissues reveals novel genes and molecular pathways underlying major depression. PLoS Genetics, 15(7), e1008245.
Gerring, Z. F., Mina-Vargas, A., & Derks, E. M. (2019). eMAGMA: An eQTL-informed method to identify risk genes using genome-wide association study summary statistics. BioRxiv, 854315. https://doi.org/10.1101/854315
GTEx Consortium. (2015). Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science, 348(6235), 648–660. https://doi.org/10.1126/science.1262110
Gusev, A., Ko, A., Shi, H., Bhatia, G., Chung, W., Penninx, B. W. J. H., … Pasaniuc, B. (2016). Integrative approaches for large-scale transcriptome-wide association studies. Nature Genetics, 48(3), 245–252.
Hancock, D. B., Guo, Y., Reginsson, G. W., Gaddis, N. C., Lutz, S. M., Sherva, R., … Johnson, E. O. (2018). Genome-wide association study across European and African American ancestries identifies a SNP in DNMT3B contributing to nicotine dependence. Molecular Psychiatry, 23(9), 1911–1919. https://doi.org/10.1038/mp.2017.193
Hill, W. D., Marioni, R. E., Maghzian, O., Ritchie, S. J., Hagenaars, S. P., McIntosh, A. M., … Deary, I. J. (2019). A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Molecular Psychiatry, 24(2), 169–181. https://doi.org/10.1038/s41380-017-0001-5
Hormozdiari, F., van de Bunt, M., Segrè, A. V., Li, X., Joo, J. W. J., Bilow, M., … Eskin, E. (2016). Colocalization of GWAS and eQTL signals detects target genes. The American Journal of Human Genetics, 99(6), 1245–1260.
James, S. L., Abate, D., Abate, K. H., Abay, S. M., Abbafati, C., Abbasi, N., … Murray, C. J. L. (2018). Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: A systematic analysis for the Global Burden of Disease Study 2017. The Lancet, 392(10159), 1789–1858. https://doi.org/10.1016/S0140-6736(18)32279-7
Jembrek, M. J., & Vlainic, J. (2015). GABA receptors: Pharmacological potential and pitfalls. Current Pharmaceutical Design, 21(34), 4943–4959. https://doi.org/10.2174/1381612821666150914121624
Langfelder, P., & Horvath, S. (2008). WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics, 9(1), 1–13.
Langfelder, P., Luo, R., Oldham, M. C., & Horvath, S. (2011). Is my network module preserved and reproducible? PLoS Computational Biology, 7(1), e1001057.
Langfelder, P., Zhang, B., & Horvath, S. (2008). Defining clusters from a hierarchical cluster tree: The Dynamic Tree Cut package for R. Bioinformatics, 24(5), 719–720.
Liu, M., Jiang, Y., Wedow, R., Li, Y., Brazel, D. M., Chen, F., … H. A.-I Psychiatry. (2019). Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nature Genetics, 51(2), 237–244. https://doi.org/10.1038/s41588-018-0307-5
Mancuso, N., Freund, M. K., Johnson, R., Shi, H., Kichaev, G., Gusev, A., & Pasaniuc, B. (2019). Probabilistic fine-mapping of transcriptome-wide association studies. Nature Genetics, 51(4), 675–682. https://doi.org/10.1038/s41588-019-0367-1
Marees, A. T., Gamazon, E. R., Gerring, Z., Vorspan, F., Fingal, J., den, van Brink, W., … Derks, E. M. (2019). Post-GWAS analysis of six substance use traits improves the identification and functional interpretation of genetic risk loci: Full list of International Cannabis Consortium members. Drug and Alcohol Dependence, 206, 107703. https://doi.org/10.1016/j.drugalcdep.2019.107703
Morris, J., Bailey, M. E. S., Baldassarre, D., Cullen, B., de Faire, U., Ferguson, A., … Strawbridge, R. J. (2019). Genetic variation in CADM2 as a link between psychological traits and obesity. Scientific Reports, 9(1), 7339. https://doi.org/10.1038/s41598-019-43861-9
Moulton, E. A., Elman, I., Becerra, L. R., Goldstein, R. Z., & Borsook, D. (2014). The cerebellum and addiction: Insights gained from neuroimaging research. Addiction Biology, 19(3), 317–331. https://doi.org/10.1111/adb.12101
Myers, A. J., Kaleem, M., Marlowe, L., Pittman, A. M., Lees, A. J., Fung, H. C., … Hardy, J. (2005). The H1c haplotype at the MAPT locus is associated with Alzheimer's disease. Human Molecular Genetics, 14(16), 2399–2404. https://doi.org/10.1093/hmg/ddi241
Nivard, M. G., Verweij, K. J. H., Minica, C. C., Treur, J. L., Derks, E. M., Stringer, S., … I. C. Consortium. (2016). Connecting the dots, genome-wide association studies in substance use. Molecular Psychiatry, 21(6), 733–735. https://doi.org/10.1038/mp.2016.14
Pasman, J. A., Verweij, K. J. H., Gerring, Z., Stringer, S., Sanchez-Roige, S., Treur, J. L., … I. C. Consortium. (2018). GWAS of lifetime cannabis use reveals new risk loci, genetic overlap with psychiatric traits, and a causal influence of schizophrenia. Nature Neuroscience, 21(9), 1161–1170. https://doi.org/10.1038/s41593-018-0206-1
Reimand, J., Arak, T., Adler, P., Kolberg, L., Reisberg, S., Peterson, H., & Vilo, J. (2016). g:Profiler – A web server for functional interpretation of gene lists (2016 update). Nucleic Acids Research, 44(Web Server issue), W83–W89. https://doi.org/10.1093/nar/gkw199
Sanchez-Roige, S., Fontanillas, P., Elson, S. L., Gray, J. C., de Wit, H., MacKillop, J., & Palmer, A. A. (2019). Genome-wide association studies of impulsive personality traits (BIS-11 and UPPS-P) and drug experimentation in up to 22,861 adult research participants identify loci in the CACNA1I and CADM2 genes. The Journal of Neuroscience, 39(13), 2562–2572. https://doi.org/10.1523/JNEUROSCI.2662-18.2019
Sanchez-Roige, S., Palmer, A. A., Fontanillas, P., Elson, S. L., Adams, M. J., Howard, D. M., … Clarke, T.-K. (2018). Genome-wide association study meta-analysis of the alcohol use disorders identification test (AUDIT) in two population-based cohorts. American Journal of Psychiatry, 176(2), 107–118.
Skipper, L., Wilkes, K., Toft, M., Baker, M., Lincoln, S., Hulihan, M., … Farrer, M. (2004). Linkage disequilibrium and association of MAPT H1 in Parkinson disease. American Journal of Human Genetics, 75(4), 669–677. https://doi.org/10.1086/424492
van der Wijst, M. G. P., Brugge, H., de Vries, D. H., Deelen, P., Swertz, M. A., Franke, L., … B Consortium. (2018). Single-cell RNA sequencing identifies cell type-specific cis-eQTLs and co-expression QTLs. Nature Genetics, 50(4), 493–497. https://doi.org/10.1038/s41588-018-0089-9
Van Essen, D. C., Donahue, C. J., & Glasser, M. F. (2018). Development and evolution of cerebral and cerebellar cortex. Brain, Behavior and Evolution, 91, 158–169. https://doi.org/10.1159/000489943
Vink, J. M., & Schellekens, A. (2018). Relating addiction and psychiatric disorders. Science, 361(6409), 1323–1324. https://doi.org/10.1126/science.aav3928
Wainberg, M., Sinnott-Armstrong, N., Mancuso, N., Barbeira, A. N., Knowles, D. A., Golan, D., … Kundaje, A. (2019). Opportunities and challenges for transcriptome-wide association studies. Nature Genetics, 51(4), 592–599. https://doi.org/10.1038/s41588-019-0385-z
Walters, R. K., Polimanti, R., Johnson, E. C., McClintick, J. N., Adams, M. J., Adkins, A. E., … 23andMe Research Team. (2018). Transancestral GWAS of alcohol dependence reveals common genetic underpinnings with psychiatric disorders. Nature Neuroscience, 21(12), 1656–1669. https://doi.org/10.1038/s41593-018-0275-1
Wang, D., Liu, S., Warrell, J., Won, H., Shi, X., Navarro, F. C. P., … Gerstein, M. B. (2018). Comprehensive functional genomic resource and integrative model for the human brain. Science, 362(6420), eaat8464. https://doi.org/10.1126/science.aat8464
Yan, X., Wang, Z., Schmidt, V., Gauert, A., Willnow, T. E., Heinig, M., & Poy, M. N. (2018). Cadm2 regulates body weight and energy homeostasis in mice. Molecular Metabolism, 8, 180–188. https://doi.org/10.1016/j.molmet.2017.11.010
Zhou, H., Sealock, J. M., Sanchez-Roige, S., Clarke, T.-K., Levey, D. F., Cheng, Z., … Gelernter, J. (2020). Genome-wide meta-analysis of problematic alcohol use in 435,563 individuals yields insights into biology and relationships with other traits. Nature Neuroscience, 23(7), 809–818. https://doi.org/10.1038/s41593-020-0643-5

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of this article.

How to cite this article: Gerring ZF, Vargas AM, Gamazon ER, Derks EM. An integrative systems-based analysis of substance use: eQTL-informed gene-based tests, gene networks, and biological mechanisms. Am J Med Genet Part B. 2021;186B:162–172. https://doi.org/10.1002/ajmg.b.32829



162 © 2020 Wiley Periodicals LLC wileyonlinelibrary.com/journal/ajmgb Am J Med Genet. 2021;186B:162–172.
