Platform for Personalized Oncology | Bioinformatics | Biochemistry

Platform for Personalized Oncology

Integrative analyses reveal novel molecular signatures associated with colorectal cancer relapse Subha Madhavan, Ph.D. Innovation Center for Biomedical Informatics Georgetown University Medical Center AMIA TBI, 2013

Driving Factors
• Need for an information continuum (care -> research -> back to care) • Incorporate “omics-based evidence” in clinical research and care settings • Collect data once and use it multiple times – quality control, secondary use for research • Connect research platforms to accelerate scientific discovery • Maintain data provenance and key findings to support funding opportunities

G-DOC links to many information silos
Pathway Studio
• Systems biology analysis • Literature mining

Cytoscape
• Visualization networks (pathways, interactions)

JMol/Marvin
• 3-D Structure and Molecule Visualization

Ingenuity Variant Analysis
• Variants • Pathways, GO, Literature

G-DOC
JBrowse
• Genome Visualization

Clinical Research
• REDCap • Medidata

EHR Heatmap Viewer
• Visualization of copy number data • ARIA • AMALGA • CENTRICITY

G-DOC information flow

https://gdoc.georgetown.edu

Over 350 registered users from six countries ~250 unique hits/month

Correlate abnormality/event with clinical parameters

What can we do with this platform? Colorectal Cancer Multi-Omics Study
Problem: – Colorectal cancer (CRC) is the third most commonly diagnosed cancer in the United States in both men and women – In 2013, an estimated 143,460 new cases are expected to be diagnosed, and 51,690 deaths from CRC are expected to occur in the United States – Approximately 75- 80% of patients with stage II colon cancers will be cured by surgery alone, however, 20-25% will still experience relapse and ultimately die of metastatic disease. Goals: – To obtain a multidimensional molecular “portrait” of cancer relapse – To find most informative features correlated with clinical outcome – To identify novel combinations of different types of biomarkers of relapse – To use the data for uploading to G-DOC web portal as a pilot multi-omics profiling study for further in silico analysis

Two workflows for identification of “major players”
• Biology-driven approach: Systems Biology analysis of differentially expressed features i.e. genes, microRNA etc Results: Major pathways and Bio functions affected in patients with relapse • Data-driven computational approach: Machine Learning algorithms: SVM, RF-ACE Results: Small number of most informative features associated with Relapse
Question: would these results converge on the same biology?

Gene Expression Analysis

Genes in tumor samples: Relapse vs. Relapse free PCA based on T-test p<0.05

Enrichment Analysis: Diff. Expressed Genes Bio-Functions Most Affected In Relapse Group

Sub-network enrichment: Cell processes
Gene Set Seed
inflammatory response

Overlapping Entities
EGFR,VCAM1,TNFSF11,IGF2,MAPK3,TACR3,CD5,NOS2,CSF3,HBEGF,HSPB1,CASP1, APP,RETN,ILK,HSPD1,EGR1,TACR1,CXCL10,CXCL11,CXCR2,SELL,PPARD,WNT5A,IR F1,ALOX15,CXCL1,PYCARD,SCARB1,IL1RN,IL7R,MMP7,TLR9,STAT1,LY96,TACR2,PR SS1,FPR2,ATF4,MMP12,FCGR3A,CFH,ICOS,CTLA4,IDO1,VIPR1,CD86,CXCL9,C4BPA, MC3R,GPR44,IL13RA2,CHI3L1,CXCL3,LCP2,SERPIND1,CD74,NLRP2,APOL1,GC,IRAK 3,CLEC4E,BTN1A1,CD300C,TNIP3,DST,IL18R1,FREM1 TRIM56,EGFR,VCAM1,TNFSF11,MAPK3,TG,CD5,NOS2,CSF3,HSPB1,CASP1,APP,RET N,HSPD1,EGR1,TACR1,CXCL10,PTPRC,CXCL11,CXCR2,SELL,PPARD,IRF1,CXCL1,P YCARD,IL1RN,IL7R,MMP7,TLR9,STAT1,LY96,TXNIP,TACR2,BCL2A1,FPR2,VEGFC,FC GR3A,CFH,ICOS,KLRK1,CTLA4,IDO1,ORM1,VIPR1,CD86,CXCL9,C4BPA,RCAN1,GPR4 4,TNFRSF13B,CXCL13,CRY2,F13A1,IL13RA2,CHI3L1,IBSP,PSMB9,LCP2,PIGA,ADAM8, GZMM,CR2,TAP1,ERVWE1,CD74,LBR,LAMP3,CSTA,UBD,APOL1,PYHIN1,GC,IRAK3,C LEC4E,FYB,BTN1A1,CD300C,S100A13,CLEC1A,MOAP1,IL18R1,SYNJ2BP TAS2R10,EGFR,VCAM1,TNFSF11,IGF2,MAPK3,TG,CD5,NOS2,CSF3,AQP3,HBEGF,HS PB1,CASP1,ATP2A1,APP,RETN,ILK,LPXN,FOLH1,E2F1,HSPD1,EGR1,TACR1,GJA5,FB XO32,DUSP6,CXCL10,PTPRC,CXCL11,CXCR2,SELL,PPARD,WNT5A,IRF1,ALOX15,CX CL1,PYCARD,SCARB1,DNM1,IRS2,IL1RN,IL7R,MMP7,FOXA2,TLR9,STAT1,TSC2,TXNI P,BCL2A1,ACVR2B,GNAI1,TP53INP1,FPR2,CPE,FAAH,GPX3,ATF4,VEGFC,MMP12,FC GR3A,CFH,ICOS,BTG2,KLRK1,CTLA4,MAOA,IDO1,ORM1,VIPR1,KCNK5,CD86,CXCL9, C4BPA,RCAN1,LMNA,OGDH,CA3,THRB,GPR44,SLURP1,SPRY2,GSTT1,TNFRSF13B,C XCL13,BCL2L14,SCIN,NUPR1,MKL1,PCSK9,SHBG,CHI3L1,CXCL3,IBSP,PSMB9,PIGA, RHOH,GZMM,CR2,EPHX1,EPM2A,GIMAP4,TAP1,SMAD6,CSRP1,NOV,MSRA,ENTPD5, GRIA2,RASSF3,RIN2,CD74,BCL2L10,LBR,TSC22D1,FSCN1,TES,PARVA,TACC1,NEDD 9,PRPF31,GPR87,CSTA,MNDA,PLK4,GBP1,SFRP5,SMG1,GALR2,AKR7A2,UBD,PIK3IP 1,PACS2,PARP15,S100A6,TIMM8A,FILIP1L,CD3D,APOL1,AIM2,PINK1,MZF1,DEDD2,FA IM2,SMPD3,HAGH,PAFAH2,PPM1A,SALL1,MOAP1,GRIN3A,PTPN7,OIP5,MYCT1,ACER 2,DMRT2,FREM1,SERPINB3,SERP2,CTRL,KCTD11,MUC17,DNASE1L1,TMEM109,CHA C1 EGFR,VCAM1,TNFSF11,IGF2,MAPK3,TG,CD5,NOS2,CSF3,AQP3,HBEGF,HSPB1,CASP 1,RETN,HSPD1,EGR1,TACR1,GJA5,CXCL10,PTPRC,PPARD,IRF1,SCARB1,IL1RN,TLR 9,STAT1,TSC2,FAAH,ATF4,VEGFC,FCGR3A,BTG2,CTLA4,IL11RA,MAOA,IDO1,ORM1,T HOP1,SHBG,PSMB9,EPHX1,SERPIND1,ERVWE1,HSD3B1,PECR,GALR2,S100A6,GC,D ST,ST3GAL6,KLRC3,STOX1 VCAM1,TNFSF11,TG,CD5,NOS2,CSF3,CASP1,APP,FOLH1,E2F1,HSPD1,EGR1,CXCL1 0,PTPRC,SELL,PYCARD,IL7R,TLR9,STAT1,TXNIP,VEGFC,FCGR3A,ICOS,KLRK1,CTLA 4,IDO1,VIPR1,CD86,CXCL9,RCAN1,TNFRSF13B,TAP1,ENTPD5,CD74,BTN1A1,CLEC1 A,MAGEC2 TNFSF11,MAPK3,CD5,NOS2,CSF3,E2F1,HSPD1,EGR1,CXCL10,PTPRC,SELL,PYCARD ,IL1RN,IL7R,TLR9,STAT1,AHNAK,ICOS,KLRK1,CTLA4,IDO1,VIPR1,KCNK5,CD86,TNFR SF13B,LCP2,CR2,NEDD9,CD3D TNFSF11,TG,CD5,NOS2,HSPD1,CXCL10,PTPRC,IRF1,DNM1,IL7R,STAT1,FCGR3A,ICO S,CTLA4,IDO1,CD86,THOP1,TNFRSF13B,PSMB9,CR2,TAP1,CD74,FSCN1,UBD,HLADMA,HLA-DMB EGFR,IGF2,TACR3,CD5,HBEGF,APP,TACR1,CXCL10,PTPRC,CXCL11,CXCR2,SELL,W NT5A,CXCL1,SCARB1,DNM1,TACR2,PRSS1,AHNAK,GNAI1,FPR2,FCGR3A,ICOS,KLR K1,CTLA4,ORM1,CXCL9,GPR44,SPRY2,TNFRSF13B,CXCL13,SCIN,CXCL3,LCP2,CR2, MCHR1,GALR2,RGS7,CD300E EGFR,VCAM1,MAPK3,CD5,NOS2,CSF3,HBEGF,HSPB1,APP,ILK,EGR1,CXCL10,CXCL1 1,CXCR2,SELL,PPARD,CXCL1,STAT1,FPR2,MMP12,ICOS,CTLA4,CD86,CXCL9,CXCL1 3,CXCL3,RHOH,VNN2

Total # of p-value Neighbors Overlap

8.13E-12

1237

68

immune response

2.31E-09

1855

82

apoptosis pregnancy T-cell response T-cell function antigen processing and presentation calcium mobilization leukocyte migration

6.71E-08

5105

165

1.12E-07

1052

52

3.30E-07 8.81E-07

650 460

37 29

1.07E-06

388

26

1.36E-06 2.94E-06

747 462

39 28

Proteins regulating cell processes of inflammatory response

Copy Number Analysis

DNA Copy Number Analysis: Affy SNP 6.0 arrays

• Raw Data  Probe Level Copy Number: 1.6 million probes • Probe Level  Segment Level Copy Number: 100K segments • Segment Level  CIN Index: Whole Chromosome: 22 Individual Cytobands: ~ 800

CIN Index - Cytoband level: Relapse vs. Relapse_Free, t-test, P<0.05
CIN Index Overall: 37 cytobands

relapse relapse-free

Subnetwork enrichment for genes located in regions with significant difference in copy number gains

Implication of copy number in CRC - Reference

Brosens et al. Cell Oncol (Dordr). 2011 Jun;34(3):215-23.. Deletion of chromosome 4q predicts outcome in stage II colon cancer patients. RESULTS: Stage II colon cancers of patients who had relapse of disease showed significantly more losses on chromosomes 4, 5, 15q, 17q and 18q.

In the microsatellite stable (MSS) subgroup (n = 28), only loss of cytobands 4q22.1- 4q35.2 was significantly associated with disease relapse

Whole Exome Sequencing – Variant Analysis

Pathway Level aggregation of variants: many related to inflammation signaling

Data driven approach
• Machine Learning Approaches: - Support Vector Machines with Random Feature Elimination (SVM-RFE)

- Random Forest for Integration and prioritization (RF-ACE)
- Bayesian Networks

Classification Algorithm: Support Vector Machine (SVM)

Gene Expression Tumors

microRNA Expression Tumors

Results of ROC Analysis for SVM classification

Data Type/Tissue Type Metabolites/Serum_Pos Gene Expression/Tumors miRNA Expression/Tumors CIN Index/Tumors

Classifier SVM SVM SVM SVM

AUC 0.7 0.81 0.84 0.87

Metabolites/Urine_Pos

SVM

0.9

Regulome Explorer with Top Predictor-Target pairs

RF-ACE results: Top Features Associated With CRC Relapse network viewer combination of genes, microRNA, copy number and metabolites

Enrichment Analysis for RF-ACE top ranked features: Genes and microRNAs combined

They do converge!

Next Steps:
• To refine systems biology findings
– Focusing on Inflammatory response pathways – Trends for co-expression pattern of specific signaling relevant to regulatory T-lymphocytes to narrow down validation experiments

• To refine multivariate analysis using predictive networks and downstream systems biology
– What would be the best candidate biomarkers?

Acknowledgements:

Lombardi Comprehensive Cancer Center: Genomics, Histopathology, and Metabolomics Shared Resources
Analytical Group: Krithika Bhuvaneshwar, Robinder Gauba, Lei Song, Yuriy Gusev Design & Interpretation Dr. Weiner, Dr. Byers, Dr. Gusev, Dr. Juhl

Innovation Center for Biomedical Informatics (ICBI)

Website : https://icbi.georgetown.edu Email : icbi@georgetown.edu

Sign up to vote on this title
UsefulNot useful