Multivariate and network tools for analysis and visualization of metabolomic data

Dmitry Grapov, Oliver Fiehn

West Coast Metabolomics Center, Genome Center, University of California, Davis

State-of-the-art facility producing massive amounts of biological data

>13,000 samples/yr
>160 studies
~32,000 data points/study

Stylized Analysis at the Metabolomic Scale

Analysis at the Metabolomic Scale

Challenges of Metabolic Analysis

Biological Analytical
Complex, wide data (variables >> samples)
Inefficient access to domain knowledge
Difficulty of translating experimental findings into actionable biological interpretation(s)

Cycle of Scientific Discovery

Data Acquisition

Data Processing

Hypothesis Generation

Network Mapping


Network Mapping
1. Network of metabolite relationships

2. Statistical and multivariate analysis

3. Mapping analysis results to the network

1. Network Generation
Define connections between metabolites

Biochemical (substrate/product): database lookup, web query
Chemical (structural or spectral similarity): fingerprint generation
Empirical (dependency): correlation, partial correlation
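
As an illustration of the chemical (structural similarity) route, the sketch below connects metabolites whose fingerprints exceed a Tanimoto similarity cutoff. It is a minimal example, not the method used in the tools listed later: the fingerprints are hypothetical toy sets of "on" bit positions, and the function names and the 0.6 threshold are assumptions chosen for the example.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_edges(fingerprints, threshold=0.7):
    """Return (metabolite_1, metabolite_2, score) for pairs at or above threshold."""
    edges = []
    for (m1, fp1), (m2, fp2) in combinations(sorted(fingerprints.items()), 2):
        score = tanimoto(fp1, fp2)
        if score >= threshold:
            edges.append((m1, m2, round(score, 3)))
    return edges


# Hypothetical toy fingerprints (not real substructure keys).
fingerprints = {
    "glucose": {1, 2, 3, 4, 5},
    "fructose": {1, 2, 3, 4, 6},
    "palmitate": {7, 8, 9},
}

edges = similarity_edges(fingerprints, threshold=0.6)
print(edges)
```

Only the glucose/fructose pair shares enough bits to form an edge here; lowering the threshold yields a denser network.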

2. Multivariate Modeling
case vs. control x treatment time-dependent differences among groups

change in time

difference between groups

Predictive Modeling: PLS, OPLS, NNMF, random forests, ANN, SVM, etc.
Clustering: HCA, k-NN, k-means, SOM, etc.
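
To make one of the listed clustering methods concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on a samples-by-metabolites table. The data are hypothetical toy values and the starting centroids are fixed by hand so the result is reproducible; a real analysis would more likely rely on an established statistics library.

```python
import math


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: 6 samples x 3 metabolites, two obvious groups.
samples = [
    [1.0, 1.2, 0.9],
    [1.1, 1.0, 1.0],
    [0.9, 1.1, 1.1],
    [3.0, 3.2, 2.9],
    [3.1, 2.9, 3.0],
    [2.9, 3.0, 3.1],
]

labels, centroids = kmeans(samples, centroids=[samples[0], samples[3]])
print(labels)
```

The first three samples fall into one cluster and the last three into the other.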

Can you spot the difference?

Multivariate Predictive Modeling

[Figure: samples from Group 1 and Group 2]

3. Network Mapping
Analysis results Network Annotation Mapped Network

Treatment Effects Network

Metabolites
Shape = increase/decrease
Size = importance (loading)
Color = correlation

Connections
red = Biochemical relationships
violet = Structural similarity
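
The legend above can be read as a lookup from analysis results to visual attributes. The sketch below shows one way to write that mapping. The edge colors follow the legend; the shape names, the size scale, the node colors, and the field names (`log2_fc`, `loading`, `correlation`) are assumptions made for the example, as are the toy values.

```python
# Edge colors taken from the legend; everything else below is illustrative.
EDGE_COLORS = {"biochemical": "red", "structural": "violet"}


def map_node(log2_fc, loading, correlation):
    """Translate per-metabolite statistics into node display attributes."""
    return {
        # Shape encodes direction of change (hypothetical shape names).
        "shape": "triangle-up" if log2_fc > 0 else "triangle-down",
        # Size grows with the absolute model loading (hypothetical scale).
        "size": 10 + 40 * abs(loading),
        # Color encodes the sign of the correlation (hypothetical palette).
        "color": "orange" if correlation > 0 else "blue",
    }


def map_network(results, edges):
    """Attach display attributes to every node and edge."""
    nodes = {name: map_node(**stats) for name, stats in results.items()}
    styled_edges = [(a, b, EDGE_COLORS[kind]) for a, b, kind in edges]
    return nodes, styled_edges


# Hypothetical toy results for two metabolites and one connection.
results = {
    "glucose": {"log2_fc": 0.8, "loading": 0.5, "correlation": 0.6},
    "fructose": {"log2_fc": -0.3, "loading": 0.25, "correlation": -0.2},
}
edges = [("glucose", "fructose", "structural")]

nodes, styled_edges = map_network(results, edges)
print(nodes)
print(styled_edges)
```

The resulting dictionaries could then be passed to whatever network viewer is in use as node and edge attribute tables.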

Time Course Network

Partial correlation Network
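
A partial correlation asks whether two metabolites remain associated once a third is accounted for, which removes indirect edges from a plain correlation network. The sketch below uses the first-order formula for three variables; a full network would typically condition on all other metabolites at once, so this is a simplification. The metabolites A, B, and C and their values are hypothetical, constructed so that A and B are related only through C.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data: A and B each track C plus independent noise.
C = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
A = [1.5, 1.5, 3.0, 4.0, 4.5, 6.5]
B = [1.5, 2.5, 2.0, 3.0, 5.5, 6.5]

r_ab = pearson(A, B)
r_ab_given_c = partial_corr(A, B, C)
print(round(r_ab, 3), round(abs(r_ab_given_c), 3))
```

Here A and B are strongly correlated, but the partial correlation given C is essentially zero, so a partial correlation network would keep the A–C and B–C edges and drop A–B.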

Gaussian Markov Network (intervention)

Available Tools (free as in speech and beer)
/projects/imdev
Bioinformatics. 2012 Sep 1;28(17):2288-90. doi: 10.1093/bioinformatics/bts439
dgrapov/devium

Biological Database Translations in R
Chemical Translation System (CTSgetR)
Chemical Identifier Resolver (CIRgetR)
/dgrapov

This research was supported in part by NIH 1 U24 DK097154