
Multivariate and network tools for analysis and visualization of metabolomic data

Dmitry Grapov, Oliver Fiehn


West Coast Metabolomics Center, Genome Center, University of California, Davis

State of the art facility producing massive amounts of biological data


>13,000 samples/yr
>160 studies
~32,000 data points/study

Stylized Analysis at the Metabolomic Scale

Nucl. Acids Res. (2008) 36 (suppl 2): W423-W426. doi: 10.1093/nar/gkn282

Analysis at the Metabolomic Scale

Challenges of Metabolic Analysis


Biological Analytical
Complex
Wide data (variables >> samples)
Inefficient access to domain knowledge
Difficulty of translating experimental findings into actionable biological interpretation(s)

Cycle of Scientific Discovery


Hypothesis
Data Acquisition Data Processing

Hypothesis Generation

Network Mapping

Data

Network Mapping
1. Network of metabolite relationships

2. Statistical and multivariate analysis

3. Mapping analysis results to the network

1. Network Generation
Define connections between metabolites

Biochemical (substrate/product): database lookup, web query

Chemical (structural or spectral similarity): fingerprint generation
BMC Bioinformatics 2012, 13:99 doi:10.1186/1471-2105-13-99

Empirical (dependency): correlation, partial correlation
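
Two of the edge definitions above can be sketched in a few lines. This is a minimal illustration, not the implementation behind the slides: fingerprints are assumed to be sets of on-bit indices, the 0.7 similarity cutoff is an arbitrary placeholder, and the partial correlation shown is the first-order form that controls for a single third variable. All function names are hypothetical.

```python
import math

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similarity_edges(fingerprints, cutoff=0.7):
    """Connect metabolite pairs whose similarity meets the cutoff (assumed value)."""
    names = sorted(fingerprints)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if tanimoto(fingerprints[a], fingerprints[b]) >= cutoff]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

When two metabolites correlate only because both track a third, the plain correlation stays high while the partial correlation drops toward zero, which is why partial correlation tends to give sparser empirical networks.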

2. Multivariate Modeling
case vs. control x treatment time-dependent differences among groups

change in time

difference between groups

Predictive Modeling: PLS, OPLS, NNMF, random forests, ANN, SVM, etc.
Clustering: HCA, k-NN, k-means, SOM, etc.
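
As one concrete instance of the clustering methods listed, here is a minimal k-means sketch in plain Python. It assumes samples are tuples of metabolite values and, to stay deterministic, uses the first k samples as starting centers; real analyses usually use random or k-means++ initialization. The function name and the `iters` parameter are illustrative.

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm: returns (labels, centers) for a list of numeric tuples."""
    # Assumption: the first k points serve as initial centers (deterministic).
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center (squared distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Called as `kmeans(samples, 2)`, it returns one cluster label per sample and the final cluster centers.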

Can you spot the difference?


Univariate
Multivariate Predictive Modeling

Group 2

Group 1

ANOVA

PCA

PLS
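
For the univariate side of this comparison, the one-way ANOVA F statistic for a single metabolite can be sketched as follows. It assumes each group is a list of that metabolite's values; the function name is illustrative, and converting F to a p-value (which requires the F distribution) is left out.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Because this tests one variable at a time, it cannot detect group differences that only appear in combinations of metabolites, which is the gap that PCA and PLS address.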

3. Network Mapping
Analysis results Network Annotation Mapped Network

Treatment Effects Network


Metabolites
Shape = increase/decrease
Size = importance (loading)
Color = correlation

Connections
Red = biochemical relationships
Violet = structural similarity
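
The legend above amounts to a mapping from analysis results to visual attributes, which can be sketched as below. The attribute names, the shape labels, the size scaling (10 to 50), and the "positive"/"negative" color classes are all assumptions made for illustration; only the red and violet edge colors come from the legend itself.

```python
# Edge colors as given in the legend.
EDGE_COLORS = {"biochemical": "red", "structural": "violet"}

def node_style(fold_change, loading, correlation):
    """Map one metabolite's results to hypothetical node display attributes."""
    return {
        # Shape encodes direction of change (shape names are placeholders).
        "shape": "triangle-up" if fold_change > 0 else "triangle-down",
        # Size grows with the absolute model loading (scaling is an assumption).
        "size": 10 + 40 * abs(loading),
        # Color encodes the sign of the correlation (classes are placeholders).
        "color": "positive" if correlation >= 0 else "negative",
    }

def edge_style(relationship):
    """Look up the edge color for a relationship type."""
    return {"color": EDGE_COLORS[relationship]}
```

A dictionary per node or edge like this can then be passed to whichever network visualization tool is in use.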

Time Course Network

Partial correlation Network

Gaussian Markov Network (intervention)

Available Tools (free as in speech and beer)


http://sourceforge.net/projects/imdev
Bioinformatics. 2012 Sep 1;28(17):2288-90. doi: 10.1093/bioinformatics/bts439

https://github.com/dgrapov/devium

Biological Database Translations in R
Chemical Translation System (CTSgetR)
Chemical Identifier Resolver (CIRgetR)
https://github.com/dgrapov

dgrapov@ucdavis.edu metabolomics.ucdavis.edu

This research was supported in part by NIH 1 U24 DK097154