
Human Genetics Shape the Gut Microbiome


ARTICLE in CELL NOVEMBER 2014
Impact Factor: 33.12 DOI: 10.1016/j.cell.2014.09.053


Article

Human Genetics Shape the Gut Microbiome
Julia K. Goodrich,1,2 Jillian L. Waters,1,2 Angela C. Poole,1,2 Jessica L. Sutter,1,2 Omry Koren,1,2,7 Ran Blekhman,1,8
Michelle Beaumont,3 William Van Treuren,4 Rob Knight,4,5,6 Jordana T. Bell,3 Timothy D. Spector,3 Andrew G. Clark,1
and Ruth E. Ley1,2,*
1Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
2Department of Microbiology, Cornell University, Ithaca, NY 14853, USA
3Department of Twin Research and Genetic Epidemiology, King's College London, London SE1 7EH, UK
4Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309, USA
5Biofrontiers Institute, University of Colorado, Boulder, CO 80309, USA
6Howard Hughes Medical Institute, University of Colorado, Boulder, CO 80309, USA
7Present address: Faculty of Medicine, Bar Ilan University, Safed 13115, Israel
8Present address: Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455, USA
*Correspondence: rel222@cornell.edu
http://dx.doi.org/10.1016/j.cell.2014.09.053

SUMMARY

Host genetics and the gut microbiome can both influence metabolic phenotypes. However, whether host
genetic variation shapes the gut microbiome and interacts with it to affect host phenotype is unclear. Here,
we compared microbiotas across >1,000 fecal samples obtained from the TwinsUK population, including
416 twin pairs. We identified many microbial taxa
whose abundances were influenced by host genetics.
The most heritable taxon, the family Christensenellaceae, formed a co-occurrence network with other
heritable Bacteria and with methanogenic Archaea.
Furthermore, Christensenellaceae and its partners
were enriched in individuals with low body mass
index (BMI). An obese-associated microbiome was
amended with Christensenella minuta, a cultured
member of the Christensenellaceae, and transplanted
to germ-free mice. C. minuta amendment reduced
weight gain and altered the microbiome of recipient
mice. Our findings indicate that host genetics influence the composition of the human gut microbiome
and can do so in ways that impact host metabolism.
INTRODUCTION
The human gut microbiome has been linked to metabolic disease and obesity (Karlsson et al., 2013; Le Chatelier et al.,
2013; Ley et al., 2005; Qin et al., 2012; Turnbaugh et al., 2009).
Variation in host genetics can also underlie susceptibility to
metabolic disease (Frayling et al., 2007; Frazer et al., 2009; Herbert et al., 2006; Yang et al., 2012). Despite these shared effects,
the relationship between host genetic variation and the diversity
of gut microbiomes is largely unknown.
The gut microbiome is environmentally acquired from birth
(Costello et al., 2012; Walter and Ley, 2011). It may therefore function
as an environmental factor that interacts with host genetics to
shape phenotype, as well as a genetically determined attribute
that is shaped by, and interacts with, the host (Bevins and Salzman, 2011;
Spor et al., 2011; Tims et al., 2011). Because the microbiome can be
modified for therapeutic applications (Borody
and Khoruts, 2012; Hamilton et al., 2013; Khoruts et al., 2010;
van Nood et al., 2013), it constitutes an attractive target for
manipulation. Once the interactions between host genetics and
the microbiome are understood, its manipulation could be optimized
for a given host genome to reduce disease risk.
Although gut microbiomes can differ markedly in diversity
across adults (Human Microbiome Project Consortium, 2012;
Qin et al., 2010), family members are often observed to have
more similar microbiotas than unrelated individuals (Lee et al.,
2011; Tims et al., 2013; Turnbaugh et al., 2009; Yatsunenko
et al., 2012). Familial similarities are usually attributed to shared
environmental influences, such as dietary preference, a powerful
shaper of microbiome composition (Cotillard et al., 2013; David
et al., 2014; Wu et al., 2011). Yet related individuals share a larger
degree of genetic identity, raising the possibility that shared genetic composition underlies familial microbiome similarities.
Support for a host genetic effect on the microbiome comes
mostly from studies taking a targeted approach. For instance,
the concordance rate for carriage of the methanogen Methanobrevibacter smithii is higher for monozygotic (MZ) than dizygotic (DZ)
twin pairs (Hansen et al., 2011), and studies comparing microbiotas
between human subjects differing at specific genetic loci have
shown gene-microbiota interactions (Frank et al., 2011; Khachatryan et al., 2008; Rausch et al., 2011; Rehman et al., 2011;
Wacklin et al., 2011). A more general approach to this question
has linked genetic loci with abundances of gut bacteria in mice
(Benson et al., 2010; McKnite et al., 2012), but in humans, a general
approach (e.g., using twins) has failed to reveal significant genotype effects on microbiome diversity (Turnbaugh et al., 2009; Yatsunenko et al., 2012). Thus, heritable components of the human gut
microbiome remain to be identified using an unbiased approach.
Here, we assessed the heritability of the gut microbiome with a
well-powered twin study. Comparisons between MZ and DZ twin
pairs allowed us to assess the impact of genotype and early shared
environment on their gut microbiota. Our study addressed the
following questions: Which specific taxa within the gut microbiome
are heritable, and to what extent? Which predicted metagenomic
functions are heritable? How do heritable microbes relate to host
BMI? Finally, we use fecal transplants into germ-free mice to
assess the phenotype effects of the most heritable taxon.

Cell 159, 789–799, November 6, 2014 ©2014 Elsevier Inc.

RESULTS
Twin Data Set
We obtained 1,081 fecal samples from 977 individuals: 171 MZ
and 245 DZ twin pairs, two from twin pairs with unknown zygosity,
and 143 samples from just one twin within a twinship (i.e., unrelated).
In addition, we collected longitudinal samples from 98 of these individuals
(see Supplemental Information available online). Most subjects were female,
ranging in age from 23 to 86 years (average age: 60.6 ± 0.3 years).
The average BMI of the subjects was 26.25 (± 0.16) with the
following distribution: 433 subjects had a low to normal BMI (<25),
322 had an overweight BMI (25–30), 183 were obese (>30),
and the current BMI status was unknown for 39 individuals. We generated
78,938,079 quality-filtered sequences that mapped to the Bacteria and Archaea
in the Greengenes database (average sequences per sample: 73,023 ± 889).

Figure 1. Microbiomes Are More Similar for Monozygotic Than Dizygotic Twins
(A and C–F) Boxplots of β diversity distances between microbial communities obtained when
comparing individuals within twinships for monozygotic (MZ) twin pairs and dizygotic (DZ)
twin pairs, and between unrelated individuals (UN). (A) The whole microbiome. (C) The
bacterial family Ruminococcaceae. (D and E) The bacterial family Lachnospiraceae. (F) The
family Bacteroidaceae. The specific distance metric used in each analysis is indicated on
the axes. *p < 0.05, **p < 0.01, ***p < 0.001 for Student's t tests with 1,000 Monte Carlo
simulations.
(B) The average relative abundances in the whole data set for the top six most prevalent
bacterial families (unrarefied data, see Experimental Procedures).
See also Figure S1 and Table S1.

Microbiome Composition and Richness
We sorted sequences into 9,646 operational taxonomic units (OTUs, ≥97% ID).
Of these OTUs, 768 were present in at least 50% of the samples. Taxonomic
classification revealed a fairly typical Western diversity profile: the dominant bacterial
phyla were Firmicutes (53.9% of total sequences), Bacteroidetes (35.3%), Proteobacteria (4.5%),
with Verrucomicrobia, Actinobacteria, and Tenericutes each
comprising 2% of the sequences, and a tail of rare bacterial phyla
that together accounted for the remaining 1% of the sequences.
The most widely shared methanogen was M. smithii (64% of
people, using nonrarefied data), followed by vadinCA11, a member
of the Thermoplasmata with no cultured representatives
(6%), Methanosphaera stadtmanae (4%), and Methanomassiliicoccus
(4%, a member of the Thermoplasmata). Forty-six
of the 61 samples in which we detected vadinCA11 also
contained M. smithii, indicating that the two most dominant
archaeal taxa are not mutually exclusive. Faith's PD was positively
correlated with the relative abundance of the family
Methanobacteriaceae (rho = 0.42 rarefied, 0.37 for transformed

counts, p < 1 × 10^-11), which corroborates previous reports of
higher richness associating with methanogens.

Broad Diversity Comparisons between MZ and DZ
Twin Pairs
We observed that microbiotas were more similar overall within
individuals (resampled) than between unrelated individuals (p <
0.001 for weighted and unweighted UniFrac and Bray-Curtis using
a Student's t test with 1,000 Monte Carlo simulations) (Table
S1A) and were also more similar within twin pairs compared to
unrelated individuals (p < 0.009 for weighted and unweighted
UniFrac and Bray-Curtis) (Figures 1 and S1; Table S1). MZ twin
pairs had more similar microbiotas than DZ twins for the unweighted
UniFrac metric (p = 0.032), but not the weighted UniFrac
and Bray-Curtis metrics (Figures 1A and S1). As greater
similarities for MZ versus DZ twin pairs are seen in unweighted
UniFrac but not abundance-based metrics, MZ similarities are
driven by shared community membership rather than structure.
We next constrained the distance metric analyses to the three
most dominant bacterial families: the Lachnospiraceae and
Ruminococcaceae (Firmicutes) and Bacteroidaceae (Figure 1B).
We observed greater similarities for MZ compared to DZ twins
using the unweighted UniFrac metric within the Ruminococcaceae
family (Figure 1C). Within the Lachnospiraceae family,
significantly greater similarity for MZ compared to DZ twins
emerged using the weighted UniFrac and Bray-Curtis metrics
(Figures 1D and 1E). In contrast, when restricted to the Bacteroidaceae
family, we found that MZ and DZ twins had similar pairwise
diversity using all three metrics (Figures 1F, S1B, and S1E).

MZ Twins Have More Highly Correlated Microbiotas
We next asked if the abundances of specific taxa were generally
more highly correlated within MZ twin pairs compared to DZ twin
pairs. For each twin pair we generated intraclass correlation coefficients (ICCs) for
the relative abundances of OTUs. Mean
ICCs were significantly greater for MZ
compared to DZ twin pairs (Wilcoxon
signed rank test on ICCs at the OTU
level, p = 6 × 10^-4; Figure 2). Because
many OTUs are closely phylogenetically
related, their abundances may not be independent, which may
inflate levels of significance. To account for this effect, we maintained
the structure of the phylogenetic tree but permuted the
MZ and DZ labels in 10,000 tests to generate randomized
ICCs. As an independent validation, we also applied these analyses
to two previously published data sets originating
in a population of twins from Missouri, USA: Turnbaugh
(Turnbaugh et al., 2009), which described 54 twin pairs ranging from
21 to 32 years of age, and Yatsunenko (Yatsunenko et al.,
2012), which included 63 twin pairs with an age range of 13–30
years. Mean ICCs of OTU abundances were significantly
greater for MZ compared to DZ twin pairs in both of these data
sets (significance by permutation: p < 0.001 and 0.047,
respectively; Figure S2), corroborating our observations.

Figure 2. OTU Relative Abundances Are More Highly Correlated within MZ Than DZ
Twin Pairs
Left: a phylogeny of taxa in the TwinsUK study
(Greengenes tree pruned to include only OTUs
shared by 50% of the TwinsUK participants).
Right: corresponding twin-pair intraclass correlation coefficients (ICCs). ICCs were calculated for
each OTU, and the difference in correlation coefficients for MZ twin pairs versus DZ twin pairs is shown.
Bars pointing to the right indicate that the difference is positive (i.e., MZ ICCs > DZ ICCs) and bars
pointing to the left indicate negative differences
(DZ ICCs > MZ ICCs). The scale bar associated
with the phylogeny shows substitutions/site.
See also Figure S2.

Heritability Estimates for OTUs and Predicted Functions
We estimated heritability using the twin-based ACE model,
which partitions the total variance into three component sources:
genetic effects (A), common environment (C), and unique
environment (E) (Eaves et al., 1978). The largest proportion of
variance in abundances of OTUs could be attributed to the twins'
unique environments (i.e., E > A; Table S2). However, for the
majority of OTUs (63%), the proportion of variance attributed
to genetic effects was greater than the proportion of variance
attributed to common environment (A > C; Table S2).

Figure 3. Heritability of Microbiota in the TwinsUK Data Set
(A) OTU Heritability (A from ACE model) estimates
mapped onto a microbial phylogeny and displayed
using a rainbow gradient from blue (A = 0) to red (A
≥ 0.4). This phylogenetic tree was obtained from
the Greengenes database and pruned to include
only nodes for which at least 50% of the TwinsUK
participants were represented.
(B) The significance for the heritability values
shown in (A) was determined using a permutation
test (n = 1,000) and is shown on the same phylogeny as in (A). P values range from 0 (red) to >0.1
(blue).
See also Figure S3 and Table S2.

From the ACE model, we calculated 95% confidence intervals
for the heritability estimates and determined the significance of
the heritability values using a permutation method to generate
nominal p values (Table S2). We found a high correlation between
the tail probability for inclusion of zero in the confidence
interval of heritability and the p values obtained from the permutation
tests (rho = 0.872, p < 10^-15), indicating substantial
consistency across these tests. Although heritability studies

traditionally report confidence intervals and nominal p values
only, we also generated FDR-corrected p values (Table S2).
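To make the A, C, and E components concrete, the following is a minimal sketch that uses Falconer's classical approximation from twin correlations. This is a simpler stand-in, not the ACE model fit used in the study. The function name `falconer_ace` and the example correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate variance components from twin-pair correlations.

    Falconer's approximation (an illustrative substitute for a full
    ACE model fit):
      A = 2 * (r_mz - r_dz)   additive genetic effects
      C = 2 * r_dz - r_mz     common (shared) environment
      E = 1 - r_mz            unique environment
    The three components sum to 1 by construction. Estimates are not
    clamped, so sampling noise can push A or C below zero.
    """
    a = 2.0 * (r_mz - r_dz)
    c = 2.0 * r_dz - r_mz
    e = 1.0 - r_mz
    return a, c, e


# Hypothetical correlations, chosen only to illustrate the arithmetic.
a, c, e = falconer_ace(r_mz=0.5, r_dz=0.3)
print(f"A={a:.2f} C={c:.2f} E={e:.2f}")
```

Under this approximation, a taxon whose abundance is more strongly correlated within MZ than DZ pairs receives a positive A, which matches the intuition behind the MZ versus DZ comparisons in the text.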
We also applied the ACE model to the abundances of sequences mapping to each node in the phylogeny. Across the
three studies, the nodes of the phylogeny with the strongest heritabilities lie within the Ruminococcaceae and Lachnospiraceae
families, and the Bacteroidetes are mostly environmentally
determined (Figures 3 and S3). Subsets of the Archaea are
also heritable in the TwinsUK and the Yatsunenko studies (the
Turnbaugh study did not include data for Archaea).
We characterized the longitudinal stability of each OTU by
calculating the ICCs of the OTU abundance across repeat samples, which consisted of two samples collected from the same
individual at different times. By permuting these repeat sample
ICCs, we found that heritable OTUs (A > 0.2) were more stable
(ICC > 0.6) than expected by chance (Figure S3E; p < 0.001, p
value was determined as the fraction of permutations that had
greater than or equal to the observed number of OTUs that are
both heritable and stable).
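The permutation p value described above can be sketched as follows. This is an illustrative reading of the procedure, assuming each OTU has already been flagged as heritable (A > 0.2) and as stable (ICC > 0.6). The function name, the default of 1,000 permutations, and the seed are assumptions, not details taken from the study.

```python
import random


def overlap_permutation_p(heritable, stable, n_perm=1000, seed=0):
    """Fraction of permutations in which at least as many OTUs are both
    heritable and stable as in the observed data.

    heritable, stable: equal-length lists of booleans, one entry per OTU.
    The stability flags are shuffled across OTUs on each permutation.
    """
    if len(heritable) != len(stable):
        raise ValueError("flag lists must have equal length")
    observed = sum(h and s for h, s in zip(heritable, stable))
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    shuffled = list(stable)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        count = sum(h and s for h, s in zip(heritable, shuffled))
        if count >= observed:
            hits += 1
    return hits / n_perm
```

A small p value here means the observed overlap between heritable and stable OTUs is rarely matched when stability is assigned to OTUs at random.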
We used PICRUSt (Langille et al., 2013) to produce predicted
metagenomes from the 16S rRNA gene sequence data and
applied the ACE model to estimate the heritability of predicted
abundances of conserved orthologous groups (COGs). This
analysis revealed six functions with heritabilities A > 0.2 and
nominal p values < 0.05 (p values are generated by permutation
testing; Extended Experimental Procedures; Table S2). After correcting for multiple comparisons, one category, secondary metabolites biosynthesis, transport and catabolism (Q), passed a
stringent FDR (A = 0.32, 95% confidence interval [CI] = 0.16–0.44).
We also tested α diversity for heritability and found that it
was not heritable.
The Family Christensenellaceae Is the Most Highly
Heritable Taxon
The most heritable taxon overall was the family Christensenellaceae (A = 0.39, 95% CI = 0.21–0.49, p = 0.001; Figure 4A; Table
S2; this taxon passes a stringent FDR) of the order Clostridiales.
Christensenellaceae was also highly heritable in the Yatsunenko
data set (A = 0.62, 95% CI = 0.38–0.77;
Figure 4B; Table S2). We repeated this
analysis for the taxa abundances with
the effect of BMI regressed out, and
results were highly correlated (Pearson correlation = 0.95, p <
1 × 10^-15).
Christensenellaceae Is the Hub in a Co-Occurrence
Network with Other Heritable Taxa
We observe a module of co-occurring heritable families, and the
hub (node connected to most other nodes) of this module is the
family Christensenellaceae (Figures 5A and S4A). The heritable
module includes the families Methanobacteriaceae (Archaea)
and Dehalobacteriaceae (Firmicutes) and the orders SHA-98
(Firmicutes), RF39 (Tenericutes), and ML615J-28 (Tenericutes).
The Christensenellaceae network is anticorrelated with the Bacteroidaceae and Bifidobacteriaceae families. We validated these
results by applying this method to the family-level taxonomic
abundances in the Yatsunenko data set (as this one is most technically similar to the TwinsUK data set), where we also found the
same Christensenellaceae-centered module of heritable families
anticorrelated to the Bacteroidaceae/Bifidobacteriaceae module (Figure S4B).
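The notion of a hub as the node connected to the most other nodes can be illustrated with a short sketch. It assumes pairwise correlations between families have already been computed (the study used SparCC, which is not reimplemented here). The edge threshold of 0.3, the choice to count both positive and negative correlations as edges, and the placeholder family names and values are all assumptions made for illustration.

```python
from collections import defaultdict


def find_hub(correlations, threshold=0.3):
    """Return (hub, degrees) for a network whose edges are the family
    pairs with |correlation| >= threshold.

    correlations: dict mapping (family_a, family_b) -> correlation.
    Returns (None, {}) if no pair passes the threshold.
    """
    degrees = defaultdict(int)
    for (fam_a, fam_b), r in correlations.items():
        if abs(r) >= threshold:
            degrees[fam_a] += 1
            degrees[fam_b] += 1
    if not degrees:
        return None, {}
    hub = max(degrees, key=degrees.get)  # node with the most edges
    return hub, dict(degrees)


# Invented placeholder values, not data from the study.
toy = {
    ("family_1", "family_2"): 0.6,
    ("family_1", "family_3"): 0.5,
    ("family_1", "family_4"): -0.4,
    ("family_2", "family_3"): 0.1,
}
hub, degrees = find_hub(toy)
```

In this toy network `family_1` is the hub because it has three edges, while every other family has one.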
Christensenellaceae Associates with a Low BMI
The family Christensenellaceae was significantly enriched in
subjects with a lean BMI (<25) compared to those with an obese
BMI (>30; Benjamini-Hochberg corrected p value < 0.05 from t
test on transformed counts; Table S2). Other members of the
Christensenellaceae consortium were also enriched in lean-BMI subjects: the Dehalobacteriaceae, SHA-98, RF39, and the
Methanobacteriaceae (Figure 5B). Overall, a majority (n = 35) of
the OTUs with highest heritability scores (A > 0.2, nominal p <
0.05) were enriched in the lean subjects. A subset of OTUs classified as Oscillospira were enriched in lean subjects, and
M. smithii, although not significantly heritable, was positively
associated with a lean BMI.
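For readers unfamiliar with the Benjamini-Hochberg correction mentioned above, the standard step-up adjustment can be sketched in a few lines. The function name and the example p values are hypothetical, and this is not the study's analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    For the p value of rank k (1 = smallest) among m tests, the adjusted
    value is the minimum of p * m / rank over all ranks >= k, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each adjusted value can
    # take the minimum over all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p values for four taxa.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A taxon is then reported as significant when its adjusted value falls below the chosen threshold, such as the 0.05 used in the text.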
Christensenellaceae Is Associated with Health in
Published Data Sets
Because the names Christensenella and Christensenellaceae
were only recently assigned to the bacterial phylogeny, we
assessed the abundances of sequences assigned to these taxa
in previously published studies. This analysis revealed that members of the
Christensenellaceae were enriched in fecal samples
of healthy individuals versus pediatric and young adult IBD patients (p <
0.05) (Papa et al., 2012). Christensenellaceae were at greater
abundance in lean-BMI compared to obese-BMI twins in the
Turnbaugh data set, but the difference did not reach significance
(time-point 2 samples, p = 0.07). In a case study of the development of an
infant's gut microbiome (Koenig et al., 2011), Christensenellaceae was present
at 8.6% in the mother's stool at the
time of birth and at 20% in the infant's meconium. We also noted
that Christensenellaceae is enriched in omnivorous compared to
herbivorous and carnivorous mammals (Muegge et al., 2011).
However, we did not find a relationship between Christensenellaceae and diet
information in human studies (Wu et al., 2011;
Martínez et al., 2010; Koren et al., 2012).

Figure 4. MZ Twin Pairs Have Higher Correlations of Christensenellaceae Than DZ
Twin Pairs in TwinsUK and Yatsunenko Data Sets
Scatter plots comparing the abundances of
Christensenellaceae in the gut microbiota of MZ
and DZ co-twins. Christensenellaceae abundances were transformed and adjusted to control
for technical and other covariates (residuals are
plotted, see Extended Experimental Procedures)
and the data are separated by zygosity (MZ or DZ
twins).
(A) TwinsUK data set.
(B) Yatsunenko data set.
Christensenellaceae Is Associated with Reduced
Weight Gain in Germ-free Mice Inoculated with Lean
and Obese Human Fecal Samples
Methanogens co-occurred with Christensenellaceae in this study
and have been linked to low BMI in previous studies. Because of
this previous association with a low BMI, we wanted to ensure
that methanogens were present in the Christensenellaceae
consortium in an initial experiment exploring
its effect on weight phenotypes. Therefore, we selected 21 donors for fecal
transfer to germ-free mice based on BMI status
(low or high) and presence or absence
of the methanogen-Christensenellaceae
consortium. Donors fell into one of four
categories: lean with detectable methanogens (L+), lean without methanogens (L−),
obese with methanogens (O+), or obese
without methanogens (O−). The abundance of Christensenellaceae positively
correlated with the abundance of methanogens in donor stool (rho = 0.72, p =
0.0002), indicating that methanogen abundance was a good proxy for the
methanogen-Christensenellaceae consortium.
A 16S rRNA analysis of the fecal microbiomes before and after transfer to
germ-free mice showed that although members of the Christensenellaceae were
present throughout the experiment in recipient mice
(Figure 6A), M. smithii was undetectable in the mouse fecal or
cecal samples (the first sampling was at 20 hr postinoculation).
At 20 hr postinoculation, the microbiota had shifted dramatically
in diversity from the inoculation, but by day 5 had shifted back
partially and remained fairly stable through day 21 (Figures 6B,
6C, S5A, and S5B). Abundances of Christensenella were correlated with PC3
(abundances rarefied at 55,000 sequences per
sample versus unweighted UniFrac; Spearman rho = 0.59, p <
2.2 × 10^-16), and PC3 captured the differences between the
four donor groups (Figure 6D). We observed a trend in which Christensenella
abundances were highest in the L+ group and lowest in the
O− group (Figure 6A), which mirrored the weight differences between those
groups: the percent change in body weights of the
recipient mice was significantly lower in the L+ group compared
to the O− group (day 12, p < 0.05, t test; Figures 6E and 6F). Cecal
levels of propionate and butyrate were significantly elevated in
mice receiving methanogen-positive compared to methanogen-negative microbiomes
controlling for the effect of donor
BMI (two-way ANOVA, p < 0.05 for both SCFAs; Figures S5C–S5E).
Stool energy content was significantly higher for the methanogen-positive
microbiomes at day 12, when the percent
changes in weight were greatest (two-way ANOVA, p = 0.004,
no effect of BMI or interaction; Figure S5F). In a replicated experiment,
using 21 new donors, the same weight differences were
observed (a significantly lower mean weight gain for the L+
compared to the O− mouse recipients at day 10 postinoculation;
one-way t test, p = 0.047; Figure S5G).

Figure 5. Christensenellaceae Is the Hub of a Consortium of Co-occurring Heritable Microbes that Are Associated with a Lean BMI
Both panels show the same network, built from SparCC correlation coefficients between sequence abundances collapsed at the family level. The nodes represent families and the
edges represent the correlation coefficients between families. Edges are colored blue for a positive correlation and gray for a negative correlation, and the weight
of the edge reflects the strength of the correlation. Nodes are positioned using an edge-weighted force directed layout.
(A) Nodes are colored by the heritability of the family.
(B) Nodes are colored by the significance of the association between families and a normal versus obese BMI. Family names are either indicated on the panel, or nodes are
given a letter code. Phylum Actinobacteria: (a) Actinomycetaceae, (b) Coriobacteriaceae; Phylum Bacteroidetes: (c) Barnesiellaceae, (d) Odoribacteraceae, (e)
Paraprevotellaceae, (f) Porphyromonadaceae, (g) Prevotellaceae, (h) Rikenellaceae; Phylum Firmicutes: (i) Carnobacteriaceae, (j) Clostridiaceae, (k) Erysipelotrichaceae, (l) Eubacteriaceae, (m) Lachnospiraceae, (n) Lactobacillaceae, (o) Mogibacteriaceae, (p) Peptococcaceae, (q) Peptostreptococcaceae, (r) Ruminococcaceae, (s) Streptococcaceae, (t) Tissierellaceae, (u) Turicibacteraceae, (v) Unclassified Clostridiales, (w) Veillonellaceae; Phylum Proteobacteria: (x)
Alcaligenaceae, (y) Enterobacteriaceae, (z) Oxalobacteraceae, (aa) Pasteurellaceae, (ab) Unclassified RF32; Phylum Verrucomicrobia: (ac) Verrucomicrobiaceae.
See also Figure S4.
Figure 6. Fecal Transplants from Obese and Lean UK Twins to Germ-Free Mice Reveal
Levels of Christensenellaceae Posttransfer Mirror Delayed Weight Gain
(A) Median relative abundances for OTUs classified as the genus Christensenella in the four donor
treatment groups over time in the recipient mouse
microbiotas.
(B) Principal coordinates analysis of unweighted
UniFrac distances for (1) the inoculum prior to
transplantation, (2) fecal samples at four time
points, and (3) cecal samples at day 21 posttransplant; see panel legend for color key. The
amount of variance described by the first two PCs
is shown on the axes.
(C) Mean values ± SEM for richness (Faith's PD) for
the microbiomes of the transplant mice plotted
against time (days postinoculation, with day 0 =
inoculation day).
(D) The mean values ± SEM for PC3 derived for the
same analysis as shown in (B) are plotted against
time (day 0 = inoculation day) for the four treatment
groups. The amount of variance explained by PC3
is in parentheses.
(E) Percent weight change since inoculation for
germ-free mouse recipients of 21 donor stools that
were obtained from lean or obese donors with or
without detectable M. smithii, which was used as
a marker for the Christensenellaceae consortium.
Means for each treatment group are plotted
± SEM.
(F) Boxplots for percent weight changes for the
four groups at day 12 posttransplant, when
maximal weight differences were observed. Letters next to boxes indicate significant differences if
letters are different (p < 0.05). For all panels: dark
blue, L+, lean donor with methanogens; light blue,
L−, lean donor lacking methanogens; dark orange,
O+, obese donor with methanogens; light orange,
O−, obese donor without methanogens. We
repeated this experiment with a set of 21 new mice
and unique human donors and recovered the same
effect.
See also Figure S5.

Christensenella minuta Added to Donor Stool Reduces
Adiposity Gains in Recipient Mice
Based on the observation that Christensenella levels in the
previous experiment were similar to the weight gain patterns,
we performed experiments in which a donor stool lacking
detectable Christensenella was amended with C. minuta and
weight gain of recipient mice was monitored. One obese human
donor was selected from the 21 donors from the first transplant
experiment based on its lack of detectable OTUs assigned to the
genus Christensenella. At day 21 postgavage, mice receiving the
C. minuta treatment weighed significantly less than those that
received unamended stool (nested ANOVA, p < 0.05; Figure 7A).
Adiposity was significantly lower for mice receiving the C. minuta
treatment (nested ANOVA, p = 9.4 × 10^-5; Figure 7B). Energy
content for stool collected at day 21 was not different between
treatments (data not shown).
Analysis of the microbial community by 16S rRNA gene
sequencing showed an impact on the overall community diversity that persisted over time (Figures 7C and 7D). After an initial
acclimation (20 hr), the communities within recipient mice began
to separate by treatment regardless of the effects of time and cocaging (Figures 7C, 7D, and S6). At 5 days postinoculation, the
relative abundance of C. minuta was similar to that observed in
the previous transplant experiment and persisted throughout
the duration of the study. We identified two genera that discriminated the two treatments at day 21: Oscillospira and a genus
within the Rikenellaceae were enriched in the C. minuta treatment (misclassification error rate of 0.06). Oscillospira abundances were significantly correlated with PC2 in the unweighted
UniFrac analysis of the communities (rho = 0.71, p = 0.0009;
Figure 7E), which is the PC that separates the C. minuta-amended and unamended microbiotas.
DISCUSSION
Our results represent strong evidence that the abundances of specific members of the gut microbiota are influenced in part by the genetic makeup of the host. Earlier studies using fingerprinting approaches also reported host genetic effects (Stewart et al., 2005; Zoetendal et al., 2001), but without sequence data it is not possible to know if the taxa shown here to be heritable were also driving those patterns. The Turnbaugh et al. (2009) and Yatsunenko et al. (2012) studies, which are quite similar in experimental approach, reported a lack of host genetic effect on the gut microbiome, most likely because both studies were underpowered. Nevertheless, reanalysis of the data from both studies validated our observation that the abundances of taxa are more highly correlated within MZ than DZ twin pairs. Thus, host genetic interactions with specific taxa are likely widespread across human populations, with profound implications for human biology.
The most highly heritable taxon in our data set was the family Christensenellaceae, which was also the hub of a co-occurrence network that includes other taxa with high heritability. A notable component of this network was the archaeal family Methanobacteriaceae. Similarly, Hansen et al. (2011) had previously identified members of the Christensenellaceae (reported as relatives of Catabacter) as co-occurring with M. smithii. These co-occurrence patterns could derive from different scenarios: for instance, multiple taxa may be heritable and co-occur while each taxon is affected by host genetics independently, or alternatively one (or a few) taxa may be heritable and other taxa correlate with host genetics due to their co-occurrence with these key heritable taxa. Further experimental research will be required to elucidate if the co-occurring heritable taxa interact in syntrophic partnerships or simply respond similarly to host-influenced environmental cues in the gut.

Figure 7. Addition of Christensenella minuta to Donor Stool Leads to Reduced Weight and Adiposity Gains in Recipient Mice
[Figure image not reproduced. Axes: (A) weight change (%); (B) total adiposity (%); (C and D) PC1 (28%) versus PC2 (5%), with samples from the inoculum and from 20 hr, 5 days, 11 days, 18 days, and 21 days; (E) Oscillospira proportion of sequences versus PC2. Groups: no added Christensenella and live Christensenella.]
(A) Boxplot of percent weight change for germ-free mouse recipients of a single donor stool only (lacking detectable Christensenella in unrarefied 16S rRNA data) or the donor stool amended with live C. minuta.
(B) Boxplots showing percent body fat for mice in each group at day 21 (n = 12 mice per treatment).
(C and D) Principal coordinates analysis of unweighted UniFrac distances for (1) the inoculum prior to transplantation, (2) fecal samples at five time points posttransplant; see legend for color key. The amount of variance described by the first two PCs is shown on the axes. The same data projection is shown in (C) and (D); sample symbols are colored by time point (C) and by treatment (D).
(E) Relationship between PCs from the PCoA analysis and levels of Oscillospira at day 21 (rho = 0.71, p < 0.001). Symbols are colored by treatment.
See also Figure S6.

Our results suggest that environmental
factors mostly shape the Bacteroidetes
community, because most were not heritable. These results are consistent with those of a recent study
of Finnish MZ twins, in which levels of Bacteroides spp. were
more similar between twins when their diets were similar (Simoes
et al., 2013). Members of the Bacteroidetes have been shown to
respond to diet interventions (Wu et al., 2011; David et al., 2014).
Importantly, the family Christensenellaceae is heritable in the
Yatsunenko data set and its network is also present. This validation did not involve a directed search using the taxa identified in
this study but was made by applying the ACE model independently. In the TwinsUK as well as the Missouri twins data sets,
the majority of OTUs with the highest heritability estimates fell
within the Ruminococcaceae and Lachnospiraceae families.
The Missouri and TwinsUK studies differed somewhat in the
levels and structure of heritability, which may relate to study
size (Kuczynski et al., 2010), participant age (Claesson et al.,
2011), population (Yatsunenko et al., 2012), and/or diet (Wu
et al., 2011), all of which have been shown to affect microbiome
structure.

The high heritability of the Christensenellaceae raises questions about the nature of interactions between the host and
members of this family, but to date there is little published
work with which to infer their roles. Christensenella minuta is
Gram-negative, nonspore forming, nonmotile, and produces
SCFAs (Morotomi et al., 2012). A review of publicly available
data suggests it is present from birth and associates with a
healthy state but not with diet. Thus, although diet is a heritable
trait in the same population (Menni et al., 2013; Teucher et al.,
2007), it does not appear to be driving the heritability of the Christensenellaceae. Obesity is also strongly heritable in the TwinsUK
population, raising the question of whether the heritabilities we
saw for gut microbes were driven by BMI. To test this, we reran
the heritability calculations using residuals after regressing out
the effect of BMI and found that results of the two analyses
were highly correlated. This suggests that the effect of host genetics on Christensenellaceae abundance is independent of an
effect of BMI.
Our transplantation experiments showed a moderating effect
of methanogen-presence in the human donor on weight gain of
recipient mice, although strikingly, M. smithii did not persist in
mice. In contrast, Christensenellaceae levels in mice mirrored
their weight gain. Transfer to germ-free mice of microbiomes
from obese and lean donors generally results in greater adiposity
gains for obese compared to lean donors (Ridaura et al., 2013;
Turnbaugh et al., 2008; Vijay-Kumar et al., 2010). These studies
have not reported the methanogen or Christensenellaceae status
of the donors, so whether these microbes affect the host phenotype is unknown. M. smithii has been associated with a lean
phenotype in multiple studies (Million et al., 2012, 2013; Schwiertz
et al., 2010; Armougom et al., 2009; Le Chatelier et al., 2013),
raising the possibility that methanogens are key components
of the consortium for regulating host phenotype. The results of
our methanogen-Christensenellaceae transfer revealed that
although methanogens may be a marker for a low BMI in humans,
they are not required to promote a lean phenotype in the germ-free mouse experimental model. This result suggests that methanogens may be functionally replaced by another consortium
member in the mouse, while the Christensenellaceae are not.
The results of the C. minuta spike-in experiments supported
the hypothesis that members of the Christensenellaceae promote a lean host phenotype. Addition of C. minuta also remodeled the diversity of the community. Intriguingly, Oscillospira,
which includes heritable OTUs in the TwinsUK data set and is
associated with a lean BMI, was enriched in the C. minuta-amended microbiomes. How C. minuta reshapes the community
remains to be explored. The relatively low levels of C. minuta and
its profound effects on the community and the host may indicate
that it is a keystone taxon. Together these findings indicate that
the Christensenellaceae are highly heritable bacteria that can
directly contribute to the host phenotype with which they
associate.
Conclusions
Host genetic variation drives phenotype variation, and this study solidifies the notion that our microbial phenotype is also influenced by our genetic state. We have shown that the host genetic effect varies across taxa and includes members of different phyla. The host alleles underlying the heritability of gut microbes, once identified, should allow us to understand the nature of our association with these health-associated bacteria and eventually to exploit them to promote health.
EXPERIMENTAL PROCEDURES
Human Subjects and Sample Collection
Fecal samples were obtained from adult twin pair participants of the TwinsUK
registry (Moayyeri et al., 2013). Most participants were women (only 20 men
were included). Twins collected fecal samples at home, and the samples
were refrigerated for up to 2 days prior to their annual clinical visit at King's College London, at which point they were stored at −80°C until processing.
Diversity and Phylogenetic Analyses
We amplified 16S rRNA genes (V4) from bulk DNA by PCR prior to sequencing
on the Illumina MiSeq 2 × 250 bp platform at Cornell Biotechnology Resource
Center Genomics Facility. We performed quality filtering and analysis of the
16S rRNA gene sequence data with QIIME 1.7.0 (Caporaso et al., 2010).
Predicted Metagenomes
PICRUSt v1.0.0 was used to predict abundances of COGs from the OTU abundances rarefied at 10,000 sequences per sample.
Heritability Estimations
Heritability estimates were calculated on the OTU abundances, taxon bins, nodes throughout the bacterial phylogenetic tree, α-diversity, and PICRUSt-predicted COGs using the structural equation modeling software OpenMx (Boker
et al., 2011).
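As a rough illustration of the twin-design logic behind these estimates, the sketch below computes a one-way intraclass correlation (ICC) for MZ and DZ pairs and applies Falconer's formula, h2 = 2 × (ICC_MZ − ICC_DZ). This is a deliberately simpler stand-in and not the structural equation model fitted in OpenMx; the function names and the toy trait values are assumptions made for the example.

```python
from statistics import mean


def icc_oneway(pairs):
    """One-way random-effects ICC for twin pairs given as (twin1, twin2) tuples."""
    n = len(pairs)
    grand = mean(v for pair in pairs for v in pair)
    pair_means = [mean(pair) for pair in pairs]
    # Between-pair mean square: k = 2 twins per pair, n - 1 degrees of freedom.
    msb = 2 * sum((m - grand) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: n degrees of freedom when k = 2.
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, pair_means)) / n
    # With k = 2 the general formula (MSB - MSW) / (MSB + (k - 1) * MSW) reduces to this.
    return (msb - msw) / (msb + msw)


def falconer_h2(mz_pairs, dz_pairs):
    """Falconer's estimate: twice the MZ minus DZ difference in twin correlation."""
    return 2 * (icc_oneway(mz_pairs) - icc_oneway(dz_pairs))


# Hypothetical transformed abundances of one taxon in four MZ and four DZ pairs.
mz = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
dz = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
h2 = falconer_h2(mz, dz)
```

Falconer's formula ignores confidence intervals and shared-environment constraints, which is one reason a full ACE model fit is preferable for real data.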
Microbiota Transfer Experiments
Stool samples from the TwinsUK cohort were selected as described in the
main text and inoculated into 6-week-old germ-free Swiss Webster mice via
oral gavage, with one recipient mouse per fecal donor. Mice were single-housed. For the Christensenella minuta addition, three experiments were conducted: In the first experiment, one treatment group received only donor stool
inoculum, whereas the other received donor stool amended with 1 × 10^8
C. minuta cells; for the second experiment, a heat-killed C. minuta control
was added; in the third experiment, the heat-killed control was compared to
live C. minuta only (no vehicle-only control). Mice were housed four per
cage, with three cages per treatment. In all experiments, mice were provided
with autoclaved food and water ad libitum and maintained on a 12 hr light/dark
cycle. Body weight and chow consumption were monitored and fecal pellets
were collected weekly. At sacrifice, adiposity was analyzed using DEXA.
Body, mesenteric adipose tissue, and gonadal fat pad tissue weights were
collected at this time. Gross energy content of mouse stool samples was
measured by bomb calorimetry using an IKA C2000 calorimeter (Dairy One).
Wet cecal contents were weighed and resuspended in 2% (v/v) formic acid
by vortexing and measured using a gas chromatograph (HP series 6890)
with flame ionization detection.
Statistical Analysis
Values are expressed as the mean ± SEM unless otherwise indicated. Full
methods are described in the Supplemental Information.
ACCESSION NUMBERS
The European Bioinformatics Institute (EBI) accession numbers for the sequences reported in this paper are ERP006339 and ERP006342.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, six
figures, and two tables and can be found with this article online at http://dx.doi.org/10.1016/j.cell.2014.09.053.


AUTHOR CONTRIBUTIONS
R.E.L. and A.G.C. supervised the study. J.T.B. and T.D.S. helped design study and provided comments and discussion. J.T.B. and T.D.S. oversaw collection of samples. J.K.G., R.E.L., O.K., J.L.S., A.C.P., and J.L.W. oversaw microbial data generation. J.K.G. performed the analysis with contributions from R.E.L., R.B., A.G.C., J.L.W., O.K., A.C.P., M.B., W.V.T., and R.K. J.K.G. and J.L.W. performed microbiota transfer experiments. J.K.G., J.L.W., and R.E.L. prepared the manuscript with comments from A.G.C., T.D.S., J.T.B., R.B., and R.K.

ACKNOWLEDGMENTS
We thank Wei Zhang, Sara Di Rienzi, Lauren Harroff, Largus Angenent, Hannah de Jong, Gabe Fox, Nick Scalfone, Ayme Spor, and Beth Bell for their help. We also thank three anonymous reviewers for their helpful comments and Mary-Claire King for her advice and encouragement. This work was funded by NIH RO1 DK093595, DP2 OD007444, The Cornell Center for Comparative Population Genomics, the Wellcome Trust, and the European Community's Seventh Framework Programme (FP7/2007-2013). The study also received support from the National Institute for Health Research (NIHR) BioResource Clinical Research Facility and Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust and King's College London. R.E.L. is a Fellow of the David and Lucile Packard Foundation and of the Arnold and Mabel Beckman Foundation. J.K.G. is a National Academy of Sciences predoctoral Fellow. T.D.S. is holder of an ERC Advanced Researcher Award. R.K. is a Howard Hughes Medical Institute Early Career Scientist.
Received: April 3, 2014
Revised: July 10, 2014
Accepted: September 24, 2014
Published: November 6, 2014

REFERENCES
Armougom, F., Henry, M., Vialettes, B., Raccah, D., and Raoult, D. (2009). Monitoring bacterial community of human gut microbiota reveals an increase in Lactobacillus in obese patients and Methanogens in anorexic patients. PLoS ONE 4, e7125.
Benson, A.K., Kelly, S.A., Legge, R., Ma, F., Low, S.J., Kim, J., Zhang, M., Oh, P.L., Nehrenberg, D., Hua, K., et al. (2010). Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proc. Natl. Acad. Sci. USA 107, 18933–18938.
Bevins, C.L., and Salzman, N.H. (2011). The potter's wheel: the host's role in sculpting its microbiota. Cell. Mol. Life Sci. 68, 3675–3685.
Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R., Kenny, S., Bates, T., et al. (2011). OpenMx: An open source extended structural equation modeling framework. Psychometrika 76, 306–317.
Borody, T.J., and Khoruts, A. (2012). Fecal microbiota transplantation and emerging applications. Nat. Rev. Gastroenterol. Hepatol. 9, 88–96.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Pena, A.G., Goodrich, J.K., Gordon, J.I., et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336.
Claesson, M.J., Cusack, S., O'Sullivan, O., Greene-Diniz, R., de Weerd, H., Flannery, E., Marchesi, J.R., Falush, D., Dinan, T., Fitzgerald, G., et al. (2011). Composition, variability, and temporal stability of the intestinal microbiota of the elderly. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4586–4591.
Costello, E.K., Stagaman, K., Dethlefsen, L., Bohannan, B.J., and Relman, D.A. (2012). The application of ecological theory toward an understanding of the human microbiome. Science 336, 1255–1262.
Cotillard, A., Kennedy, S.P., Kong, L.C., Prifti, E., Pons, N., Le Chatelier, E., Almeida, M., Quinquis, B., Levenez, F., Galleron, N., et al.; ANR MicroObes consortium (2013). Dietary intervention impact on gut microbial gene richness. Nature 500, 585–588.
David, L.A., Maurice, C.F., Carmody, R.N., Gootenberg, D.B., Button, J.E., Wolfe, B.E., Ling, A.V., Devlin, A.S., Varma, Y., Fischbach, M.A., et al. (2014). Diet rapidly and reproducibly alters the human gut microbiome. Nature 505, 559–563.
Eaves, L.J., Last, K.A., Young, P.A., and Martin, N.G. (1978). Model-fitting approaches to the analysis of human behaviour. Heredity (Edinb) 41, 249–320.
Frank, D.N., Robertson, C.E., Hamm, C.M., Kpadeh, Z., Zhang, T., Chen, H., Zhu, W., Sartor, R.B., Boedeker, E.C., Harpaz, N., et al. (2011). Disease phenotype and genotype are associated with shifts in intestinal-associated microbiota in inflammatory bowel diseases. Inflamm. Bowel Dis. 17, 179–184.
Frayling, T.M., Timpson, N.J., Weedon, M.N., Zeggini, E., Freathy, R.M., Lindgren, C.M., Perry, J.R., Elliott, K.S., Lango, H., Rayner, N.W., et al. (2007). A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894.
Frazer, K.A., Murray, S.S., Schork, N.J., and Topol, E.J. (2009). Human genetic variation and its contribution to complex traits. Nat. Rev. Genet. 10, 241–251.
Hamilton, M.J., Weingarden, A.R., Unno, T., Khoruts, A., and Sadowsky, M.J. (2013). High-throughput DNA sequence analysis reveals stable engraftment of gut microbiota following transplantation of previously frozen fecal bacteria. Gut Microbes 4, 125–135.
Hansen, E.E., Lozupone, C.A., Rey, F.E., Wu, M., Guruge, J.L., Narra, A., Goodfellow, J., Zaneveld, J.R., McDonald, D.T., Goodrich, J.A., et al. (2011). Pan-genome of the dominant human gut-associated archaeon, Methanobrevibacter smithii, studied in twins. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4599–4606.
Herbert, A., Gerry, N.P., McQueen, M.B., Heid, I.M., Pfeufer, A., Illig, T., Wichmann, H.E., Meitinger, T., Hunter, D., Hu, F.B., et al. (2006). A common genetic variant is associated with adult and childhood obesity. Science 312, 279–283.
Human Microbiome Project Consortium (2012). Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214.

Karlsson, F.H., Tremaroli, V., Nookaew, I., Bergstrom, G., Behre, C.J., Fagerberg, B., Nielsen, J., and Backhed, F. (2013). Gut metagenome in European women with normal, impaired and diabetic glucose control. Nature 498, 99–103.
Khachatryan, Z.A., Ktsoyan, Z.A., Manukyan, G.P., Kelly, D., Ghazaryan, K.A., and Aminov, R.I. (2008). Predominant role of host genetics in controlling the composition of gut microbiota. PLoS ONE 3, e3064.
Khoruts, A., Dicksved, J., Jansson, J.K., and Sadowsky, M.J. (2010). Changes in the composition of the human fecal microbiome after bacteriotherapy for recurrent Clostridium difficile-associated diarrhea. J. Clin. Gastroenterol. 44, 354–360.
Koenig, J.E., Spor, A., Scalfone, N., Fricker, A.D., Stombaugh, J., Knight, R., Angenent, L.T., and Ley, R.E. (2011). Succession of microbial consortia in the developing infant gut microbiome. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4578–4585.
Koren, O., Goodrich, J.K., Cullender, T.C., Spor, A., Laitinen, K., Backhed, H.K., Gonzalez, A., Werner, J.J., Angenent, L.T., Knight, R., et al. (2012). Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470–480.
Kuczynski, J., Costello, E.K., Nemergut, D.R., Zaneveld, J., Lauber, C.L., Knights, D., Koren, O., Fierer, N., Kelley, S.T., Ley, R.E., et al. (2010). Direct sequencing of the human microbiome readily reveals community differences. Genome Biol. 11, 210.
Langille, M.G., Zaneveld, J., Caporaso, J.G., McDonald, D., Knights, D., Reyes, J.A., Clemente, J.C., Burkepile, D.E., Vega Thurber, R.L., Knight, R., et al. (2013). Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821.
Le Chatelier, E., Nielsen, T., Qin, J., Prifti, E., Hildebrand, F., Falony, G., Almeida, M., Arumugam, M., Batto, J.-M., Kennedy, S., et al.; MetaHIT Consortium (2013). Richness of human gut microbiome correlates with metabolic markers. Nature 500, 541–546.
Lee, S., Sung, J., Lee, J., and Ko, G. (2011). Comparison of the gut microbiotas of healthy adult twins living in South Korea and the United States. Appl. Environ. Microbiol. 77, 7433–7437.
Ley, R.E., Backhed, F., Turnbaugh, P., Lozupone, C.A., Knight, R.D., and Gordon, J.I. (2005). Obesity alters gut microbial ecology. Proc. Natl. Acad. Sci. USA 102, 11070–11075.
Lozupone, C.A., Hamady, M., Kelley, S.T., and Knight, R. (2007). Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Appl. Environ. Microbiol. 73, 1576–1585.
Martínez, I., Kim, J., Duffy, P.R., Schlegel, V.L., and Walter, J. (2010). Resistant starches types 2 and 4 have differential effects on the composition of the fecal microbiota in human subjects. PLoS ONE 5, e15046.
McKnite, A.M., Perez-Munoz, M.E., Lu, L., Williams, E.G., Brewer, S., Andreux, P.A., Bastiaansen, J.W., Wang, X., Kachman, S.D., Auwerx, J., et al. (2012). Murine gut microbiota is defined by host genetics and modulates variation of metabolic traits. PLoS ONE 7, e39191.
Menni, C., Zhai, G., Macgregor, A., Prehn, C., Romisch-Margl, W., Suhre, K., Adamski, J., Cassidy, A., Illig, T., Spector, T.D., and Valdes, A.M. (2013). Targeted metabolomics profiles are strongly correlated with nutritional patterns in women. Metabolomics 9, 506–514.
Million, M., Maraninchi, M., Henry, M., Armougom, F., Richet, H., Carrieri, P., Valero, R., Raccah, D., Vialettes, B., and Raoult, D. (2012). Obesity-associated gut microbiota is enriched in Lactobacillus reuteri and depleted in Bifidobacterium animalis and Methanobrevibacter smithii. Int J Obes (Lond) 36, 817–825.
Million, M., Angelakis, E., Maraninchi, M., Henry, M., Giorgi, R., Valero, R., Vialettes, B., and Raoult, D. (2013). Correlation between body mass index and gut concentrations of Lactobacillus reuteri, Bifidobacterium animalis, Methanobrevibacter smithii and Escherichia coli. Int J Obes (Lond) 37, 1460–1466.
Moayyeri, A., Hammond, C.J., Valdes, A.M., and Spector, T.D. (2013). Cohort Profile: TwinsUK and healthy ageing twin study. Int J Epidemiol 42, 76–85.
Morotomi, M., Nagai, F., and Watanabe, Y. (2012). Description of Christensenella minuta gen. nov., sp. nov., isolated from human faeces, which forms a distinct branch in the order Clostridiales, and proposal of Christensenellaceae fam. nov. Int. J. Syst. Evol. Microbiol. 62, 144–149.
Muegge, B.D., Kuczynski, J., Knights, D., Clemente, J.C., Gonzalez, A., Fontana, L., Henrissat, B., Knight, R., and Gordon, J.I. (2011). Diet drives convergence in gut microbiome functions across mammalian phylogeny and within humans. Science 332, 970–974.
Papa, E., Docktor, M., Smillie, C., Weber, S., Preheim, S.P., Gevers, D., Giannoukos, G., Ciulla, D., Tabbaa, D., Ingram, J., et al. (2012). Non-invasive mapping of the gastrointestinal microbiota identifies children with inflammatory bowel disease. PLoS ONE 7, e39242.
Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., et al.; MetaHIT Consortium (2010). A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65.
Qin, J., Li, Y., Cai, Z., Li, S., Zhu, J., Zhang, F., Liang, S., Zhang, W., Guan, Y., Shen, D., et al. (2012). A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60.
Rausch, P., Rehman, A., Kunzel, S., Hasler, R., Ott, S.J., Schreiber, S., Rosenstiel, P., Franke, A., and Baines, J.F. (2011). Colonic mucosa-associated microbiota is influenced by an interaction of Crohn disease and FUT2 (Secretor) genotype. Proc. Natl. Acad. Sci. USA 108, 19030–19035.
Rehman, A., Sina, C., Gavrilova, O., Hasler, R., Ott, S., Baines, J.F., Schreiber, S., and Rosenstiel, P. (2011). Nod2 is essential for temporal development of intestinal microbial communities. Gut 60, 1354–1362.
Ridaura, V.K., Faith, J.J., Rey, F.E., Cheng, J., Duncan, A.E., Kau, A.L., Griffin, N.W., Lombard, V., Henrissat, B., Bain, J.R., et al. (2013). Gut microbiota from twins discordant for obesity modulate metabolism in mice. Science 341, 1241214.
Schwiertz, A., Taras, D., Schafer, K., Beijer, S., Bos, N.A., Donus, C., and Hardt, P.D. (2010). Microbiota and SCFA in lean and overweight healthy subjects. Obesity (Silver Spring) 18, 190–195.
Simoes, C.D., Maukonen, J., Kaprio, J., Rissanen, A., Pietilainen, K.H., and Saarela, M. (2013). Habitual dietary intake is associated with stool microbiota composition in monozygotic twins. J. Nutr. 143, 417–423.
Spor, A., Koren, O., and Ley, R. (2011). Unravelling the effects of the environment and host genotype on the gut microbiome. Nat. Rev. Microbiol. 9, 279–290.
Stewart, J.A., Chadwick, V.S., and Murray, A. (2005). Investigations into the influence of host genetics on the predominant eubacteria in the faecal microflora of children. J. Med. Microbiol. 54, 1239–1242.
Teucher, B., Skinner, J., Skidmore, P.M., Cassidy, A., Fairweather-Tait, S.J., Hooper, L., Roe, M.A., Foxall, R., Oyston, S.L., Cherkas, L.F., et al. (2007). Dietary patterns and heritability of food choice in a UK female twin cohort. Twin Res. Hum. Genet. 10, 734–748.
Tims, S., Zoetendal, E.G., Vos, W.M., and Kleerebezem, M. (2011). Host genotype and the effect on microbial communities. In Metagenomics of the Human Body, K.E. Nelson, ed. (New York: Springer), pp. 15–41.
Tims, S., Derom, C., Jonkers, D.M., Vlietinck, R., Saris, W.H., Kleerebezem, M., de Vos, W.M., and Zoetendal, E.G. (2013). Microbiota conservation and BMI signatures in adult monozygotic twins. ISME J. 7, 707–717.
Turnbaugh, P.J., Backhed, F., Fulton, L., and Gordon, J.I. (2008). Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell Host Microbe 3, 213–223.
Turnbaugh, P.J., Hamady, M., Yatsunenko, T., Cantarel, B.L., Duncan, A., Ley, R.E., Sogin, M.L., Jones, W.J., Roe, B.A., Affourtit, J.P., et al. (2009). A core gut microbiome in obese and lean twins. Nature 457, 480–484.
van Nood, E., Vrieze, A., Nieuwdorp, M., Fuentes, S., Zoetendal, E.G., de Vos, W.M., Visser, C.E., Kuijper, E.J., Bartelsman, J.F., Tijssen, J.G., et al. (2013). Duodenal infusion of donor feces for recurrent Clostridium difficile. N. Engl. J. Med. 368, 407–415.
Vijay-Kumar, M., Aitken, J.D., Carvalho, F.A., Cullender, T.C., Mwangi, S., Srinivasan, S., Sitaraman, S.V., Knight, R., Ley, R.E., and Gewirtz, A.T. (2010). Metabolic syndrome and altered gut microbiota in mice lacking Toll-like receptor 5. Science 328, 228–231.
Wacklin, P., Makivuokko, H., Alakulppi, N., Nikkila, J., Tenkanen, H., Rabina, J., Partanen, J., Aranko, K., and Matto, J. (2011). Secretor genotype (FUT2 gene) is strongly associated with the composition of Bifidobacteria in the human intestine. PLoS ONE 6, e20113.
Walter, J., and Ley, R. (2011). The human gut microbiome: ecology and recent evolutionary changes. Annu. Rev. Microbiol. 65, 411–429.
Wu, G.D., Chen, J., Hoffmann, C., Bittinger, K., Chen, Y.Y., Keilbaugh, S.A., Bewtra, M., Knights, D., Walters, W.A., Knight, R., et al. (2011). Linking long-term dietary patterns with gut microbial enterotypes. Science 334, 105–108.
Yang, J., Loos, R.J., Powell, J.E., Medland, S.E., Speliotes, E.K., Chasman, D.I., Rose, L.M., Thorleifsson, G., Steinthorsdottir, V., Magi, R., et al. (2012). FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272.
Yatsunenko, T., Rey, F.E., Manary, M.J., Trehan, I., Dominguez-Bello, M.G., Contreras, M., Magris, M., Hidalgo, G., Baldassano, R.N., Anokhin, A.P., et al. (2012). Human gut microbiome viewed across age and geography. Nature 486, 222–227.
Zoetendal, E.G., Akkermans, A.D.L., Akkermans-van Vliet, W.M., Visser, J.A.G.M.d., and Vos, W.M.d. (2001). The host genotype affects the bacterial community in the human gastrointestinal tract. Microb. Ecol. Health Dis. 13, 129–134.


Supplemental Information
EXTENDED EXPERIMENTAL PROCEDURES
Sample Collection
All work involving human subjects was approved by the Cornell University IRB (Protocol ID 1108002388). Fecal samples were
collected at home by participants in the United Kingdom Adult Twin Registry (TwinsUK; Moayyeri et al., 2013) in 15 ml conical tubes
and refrigerated for 1–2 days prior to the participants' annual clinical visits at King's College London (KCL). Upon arrival at KCL, the
samples were stored at −80°C and shipped by courier on dry ice to Cornell University, where they were stored at −80°C until
processing.
DNA Extraction, Amplicon Generation, and Sequencing
Genomic DNA was isolated from an aliquot of 100 mg from each sample using the PowerSoil-htp DNA isolation kit (MoBio
Laboratories Ltd, Carlsbad, CA). 16S rRNA genes were amplified by PCR from each of the 1,081 samples (245 DZ twin pairs, 171
MZ twin pairs, 2 twin pairs with no zygosity status reported, 143 unrelated individuals, and 98 samples taken from individuals at a
second, and for six individuals, a third time point) using the 515F and 806R primers for the V4 hypervariable region as previously
described (Caporaso et al., 2011). PCR reactions, carried out in duplicate, consisted of 2.5 U Easy-A high-fidelity enzyme, 1×
buffer (Stratagene, La Jolla, CA), 10–100 ng DNA template, and 0.05 mM of each primer. Reaction conditions consisted of initial
denaturation at 94°C for 3 min followed by 25 cycles of denaturation at 94°C for 45 s, annealing at 50°C for 60 s, extension at 72°C
for 90 s, and a final extension at 72°C for 10 min. The replicate PCR reactions were combined and purified using a magnetic bead
system (Mag-Bind EZPure, Omega Bio-Tek, Norcross, GA). PCR amplicons were quantified using the Quant-iT PicoGreen dsDNA
Assay Kit (Invitrogen, Carlsbad, CA). Aliquots of amplicons (at equal masses) were combined for a final concentration of approximately 15 ng/ml. DNA was sequenced using the Illumina MiSeq 2 × 250 bp platform at Cornell Biotechnology Resource Center
Genomics Facility.
16S rRNA Gene Sequence Analysis
Matching paired-end sequences (mate-pairs) were merged using the fastq-join command in the ea-utils software package (Aronesty, 2011), and merged sequences over 275 bp in length were filtered out of the data set. The remaining merged sequences were analyzed using the open-source software package QIIME 1.7.0 (Quantitative Insights Into Microbial Ecology; Caporaso et al., 2010). Quality filters were used to remove sequences containing uncorrectable barcodes, ambiguous bases, or low quality reads (Phred quality scores ≤ 25). We performed closed-reference OTU picking at 97% identity against the May 2013 Greengenes database (97% OTU reference sequences), which excluded 6.2% of total sequences. The taxonomic assignment of the reference sequence was used as the taxonomy for each OTU. We calculated α-diversity (Faith's phylogenetic diversity [Faith, 1992], Chao 1 [Chao, 1984], and Observed Species) and β-diversity (unweighted and weighted UniFrac metrics [Lozupone et al., 2007] and Bray-Curtis dissimilarity [Bray and Curtis, 1957]) using the Greengenes phylogenetic tree where necessary. β-diversity was calculated with a rarefied OTU table containing 10,000 sequences per sample, and Principal Coordinate Analysis (PCoA) was performed on the distance matrices. β-diversity was also calculated separately on the three most abundant bacterial families containing the most OTUs: Lachnospiraceae, Ruminococcaceae, and Bacteroidaceae. β-diversity between twin pairs was compared to that between unrelated individuals, where distances between unrelated individuals were only used if the samples were sent in the same shipment (the stool samples were shipped to the laboratory at Cornell in eight batches, and twin pairs were sent in the same batch); given the large number of unrelated pairs, we randomly sampled only 20% of the unrelated pairs. p values were calculated using the Student's t test with 1,000 Monte Carlo simulations. For α-diversity measurements, means were calculated from 100 iterations using a rarefaction of 10,000 sequences per sample. We generated summaries of the taxonomic distributions of OTUs at six levels from genus to phylum from the non-rarefied OTU table.
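The rarefaction step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the QIIME implementation; the function names, the toy counts, and the small depth are assumptions chosen so the example runs quickly (the study used 10,000 sequences per sample and 100 iterations).

```python
import random
from statistics import mean


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a dict of OTU counts."""
    if sum(counts.values()) < depth:
        return None  # sample is too shallow for this depth
    # Expand counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rarefied = {}
    for otu in rng.sample(reads, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def mean_observed_species(counts, depth, iterations, seed=0):
    """Average number of OTUs observed across repeated rarefactions."""
    if sum(counts.values()) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)
    return mean(len(rarefy(counts, depth, rng)) for _ in range(iterations))


# Hypothetical sample with 81 reads spread over three OTUs.
sample = {"otu1": 50, "otu2": 30, "otu3": 1}
one_draw = rarefy(sample, 40, random.Random(1))
avg_richness = mean_observed_species(sample, 40, iterations=10)
```

Averaging over repeated draws reduces the noise that a single random subsample adds to richness estimates, which matters most for rare OTUs such as `otu3` here.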
Tests of Repeatability of Microbiota Measures
Ninety-eight individuals supplied two fecal samples spaced 2 to 812 days apart (average = 467 ± 28 days), 44 of which are from
DZ twin pairs and 16 from MZ pairs. To determine if the microbial communities of the repeat samples from the same individuals
were more similar to each other than to samples from unrelated individuals, the β-diversity distances between pairs of repeat
samples were compared to the β-diversity distances between unrelated individuals. p values were calculated using the
Student's t test with 1,000 Monte Carlo simulations. Microbiotas for repeat samples were more similar to each other than
pairs of samples from unrelated individuals using unweighted UniFrac, weighted UniFrac and Bray-Curtis dissimilarity (Table
S1).
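The general idea of a t test with Monte Carlo simulations can be sketched as a label-permutation test: the observed t statistic is compared with t statistics obtained after shuffling group labels. The code below is an illustrative assumption about how such a test works, not the QIIME routine used in the study; the function names and distance values are hypothetical.

```python
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def monte_carlo_p(a, b, permutations=1000, seed=0):
    """Two-sided p value from shuffling group labels `permutations` times."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    combined = list(a) + list(b)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(combined)
        perm_t = abs(t_statistic(combined[:len(a)], combined[len(a):]))
        if perm_t >= observed:
            hits += 1
    # Add one to numerator and denominator so the p value is never exactly zero.
    return (hits + 1) / (permutations + 1)


# Hypothetical UniFrac distances: repeat samples versus unrelated individuals.
repeat_pairs = [0.30, 0.32, 0.35, 0.31, 0.33, 0.34]
unrelated_pairs = [0.60, 0.62, 0.65, 0.61, 0.63, 0.64]
p_value = monte_carlo_p(repeat_pairs, unrelated_pairs)
```

Because pairwise distances are not independent observations, a permutation-based p value is generally preferred over the parametric t distribution for this kind of comparison.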
Covariates
In the heritability analyses below, the gender of the participant, age at the time of collection, and the number of OTU counts per
sample (after filtering the data as mentioned above) were used as covariates in analyses. The following technical covariates
were also included in the models: identity of technician (of two), sequencing run (16 instrument runs) and shipment batch
(8 shipments).

Sequence-Based Traits Used in Heritability Calculations


The traits used for heritability estimates were the raw unrarefied counts for the OTUs (97% ID), and the abundances of taxa (genus,
family, order, class, and phylum bins) generated by summing counts for OTUs with the same classification. The methods for estimating heritability assume normally distributed data, so we performed steps to filter and transform the traits to meet this assumption.
OTUs or taxonomic groups shared by fewer than 50% of the individuals in the study (less than 50% of the counts are non-zero) were
excluded from further analyses. A multiple linear regression was performed where the Box-Cox transformed trait abundances (using
the powerTransform command implemented in the R package car and an offset of 1) were regressed on the covariates listed above.
The residuals from this regression were then used for the heritability estimates. The total number of traits used in the heritability
calculations was 909.
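The filtering, transformation, and covariate regression steps can be sketched as follows. This is an illustrative simplification under stated assumptions: the counts and sequencing depths are made up, the Box-Cox exponent is fixed by hand rather than estimated (the study estimated it with the car package), and a single covariate stands in for the full multiple regression.

```python
import math


def prevalence(counts):
    """Fraction of samples in which the trait has a non-zero count."""
    return sum(1 for c in counts if c > 0) / len(counts)


def box_cox(values, lam, offset=1.0):
    """Box-Cox transform of (value + offset) for a given exponent lam."""
    shifted = [v + offset for v in values]
    if lam == 0:
        return [math.log(x) for x in shifted]
    return [(x ** lam - 1.0) / lam for x in shifted]


def residuals_after_regression(y, x):
    """Residuals of a least-squares fit of y on one covariate x (with intercept)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up data: counts of one OTU and the read depth of each sample.
otu_counts = [0, 12, 40, 7, 0, 95, 3, 21]
sequencing_depth = [9000, 15000, 30000, 12000, 8000, 52000, 10000, 20000]

trait = None
if prevalence(otu_counts) >= 0.5:                  # 50% sharing filter
    transformed = box_cox(otu_counts, lam=0.0)     # exponent fixed for illustration
    trait = residuals_after_regression(transformed, sequencing_depth)
```

The residuals, rather than the transformed counts, are what would be carried forward as the trait, so that variation explained by the covariate does not enter the heritability estimate.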
Use of the Microbial Phylogeny in Heritability Calculations
We estimated heritability throughout the phylogeny. We obtained the phylogenetic tree from Greengenes (http://greengenes.secondgenome.com/downloads; May 2013). This phylogenetic tree was pruned to keep only the tips corresponding to OTUs found
in our data set after identifying OTUs that matched Greengenes at 97% ID (using UCLUST to perform the reference-based OTU picking; Edgar, 2010). The abundances of the OTUs were propagated up the tree to generate abundances at each node, and these abundances were used in the heritability calculations. We used the same procedures for filtering (50% sharing), transformation, and
covariate regression as those mentioned above for the heritability calculations on the OTU and collapsed taxonomy traits. Heritability
values and their corresponding levels of significance were then displayed visually on the phylogeny using a color scale applied to the
tree branches.
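Propagating OTU abundances up the tree amounts to adding each tip's count to every one of its ancestors. A minimal sketch, assuming the tree is available as a child-to-parent mapping; the node names and counts here are hypothetical and the real tree would come from the pruned Greengenes phylogeny.

```python
def propagate_counts(parent, tip_counts):
    """Sum tip counts into every ancestor node.

    parent maps each node name to its parent; the root maps to None.
    """
    totals = dict(tip_counts)
    for tip, count in tip_counts.items():
        node = parent[tip]
        while node is not None:
            totals[node] = totals.get(node, 0) + count
            node = parent[node]
    return totals


# Hypothetical four-tip tree: root -> n1 -> (otu_a, otu_b); root -> n2 -> (otu_c, otu_d)
parent = {
    "root": None, "n1": "root", "n2": "root",
    "otu_a": "n1", "otu_b": "n1", "otu_c": "n2", "otu_d": "n2",
}
node_counts = propagate_counts(
    parent, {"otu_a": 5, "otu_b": 3, "otu_c": 10, "otu_d": 0})
```

Each internal node then carries the summed abundance of all OTUs beneath it and can be filtered, transformed, and regressed like any other trait.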
Heritability Calculations
Heritability was assessed first by using intraclass correlation coefficients (ICCs) calculated within the group of MZ twins and DZ twins
for all traits. All ICC calculations were generated with the icc command from the R package irr. We used the difference of ICC between MZ and DZ twin pairs as an indication of the amount of genetic influence on the variation of the abundances for the given trait or
node. A Wilcoxon signed rank test was performed to assess the significance of the difference between the MZ and DZ OTU abundance ICC distributions. The phylogenetic relationship of the OTUs likely imposes some structure to the correlations among their
abundances (more closely related OTUs are expected to have similar attributes). To address this concern we permuted the MZ/
DZ twin pair labels 10,000 times and calculated the MZ/DZ ICCs for each OTU in the permuted data set. By permuting only the
zygosity labels, the correlation structure of the OTUs (i.e., their phylogenetic relatedness) is maintained. Then, for each permuted
data set we calculated the Wilcoxon signed rank test where the null hypothesis is that the difference between the MZ and DZ
ICCs (MZ ICC - DZ ICC) is less than or equal to 0, and the alternative hypothesis is that this difference is greater than 0. The
10,000 test statistics provide an empirical distribution of test statistics that we can compare to our actual test statistic. We obtained
a p value of 0.0006 by dividing the number of tests where the permuted test statistic was greater than the actual test statistic by
10,000. We also performed this test using 1,000 permutations on the Turnbaugh et al. and Yatsunenko et al. data sets, obtaining
P values of < 0.001 and 0.047, respectively. The trees with bar plots in Figures 2 and S2 were created using the command plotTree.wBars in the phytools R package.
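The logic of comparing MZ and DZ ICCs under permuted zygosity labels can be sketched for a single trait. Several assumptions apply: the ICC shown is the one-way random-effects form for pairs, which may not match the exact variant selected in the irr package; the test statistic is the ICC difference for one trait rather than the Wilcoxon signed rank statistic across all OTUs; and the twin values are made up.

```python
import random


def icc_oneway(pairs):
    """One-way random-effects ICC for a list of (twin1, twin2) trait values."""
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    means = [(a + b) / k for a, b in pairs]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def permutation_p_value(pairs, is_mz, n_perm=1000, seed=0):
    """Empirical p value for ICC(MZ) - ICC(DZ), shuffling zygosity labels only."""
    def icc_difference(labels):
        mz = [p for p, flag in zip(pairs, labels) if flag]
        dz = [p for p, flag in zip(pairs, labels) if not flag]
        return icc_oneway(mz) - icc_oneway(dz)

    rng = random.Random(seed)
    observed = icc_difference(is_mz)
    labels = list(is_mz)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if icc_difference(labels) > observed:
            exceed += 1
    return observed, exceed / n_perm


# Made-up trait values for five MZ and five DZ pairs.
mz_pairs = [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.9)]
dz_pairs = [(1.0, 3.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 4.0)]
observed, p_value = permutation_p_value(
    mz_pairs + dz_pairs, [True] * 5 + [False] * 5)
```

Because only the zygosity labels move, the trait values of each pair, and therefore any correlation structure among traits, stay intact in every permuted data set.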
We used the ACE model to estimate the heritability of the traits (Table S2) and nodes throughout the phylogenetic tree (Eaves et al.,
1978). The ACE model assumes that three sources of variance make up the total population phenotypic variance (V): genetic effects
(A), common environment (C), and unique environment (E). The heritability is defined as the proportion of total variance that is due to
genetic effects (A/V). Note that the term heritability is used in the twin-sense here: the A term is neither additive nor dominance
variance, but instead is a confounded mixture of the two. Consequently, the heritability we refer to is neither strictly speaking the narrow-sense nor the broad-sense heritability.
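For reference, the variance decomposition described above can be written out together with the expected twin-pair covariances of the standard ACE formulation. The one-half coefficient for DZ pairs is the usual modeling assumption that DZ twins share, on average, half of the genetic effects; the symbol h^2 for heritability is introduced here for illustration.

```latex
\begin{aligned}
V &= A + C + E, \qquad h^2 = \frac{A}{V},\\
\operatorname{Cov}(\mathrm{MZ}) &= A + C,\\
\operatorname{Cov}(\mathrm{DZ}) &= \tfrac{1}{2}A + C.
\end{aligned}
```

Under these expectations, a larger MZ than DZ covariance for a trait is what identifies a non-zero A, while E is the part of the variance not shared within a pair.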
We used the structural equation modeling (SEM) software OpenMx (Boker et al., 2011) to calculate the full ACE model and
95% confidence intervals (Table S2). A permutation test was performed to determine the significance of the SEM heritability
estimates (A). The permutation p values were calculated by permuting the zygosity (MZ or DZ) labels for the twin pairs
10,000 times, and then the ACE model was used to calculate the heritability for each of the permuted data sets. To calculate
a p value, the number of times a heritability estimate (A) met or exceeded the observed heritability estimate was divided by
the total number of permutations performed (n = 10,000). To provide multiple testing correction in the heritability analysis, we
used the Benjamini-Hochberg algorithm in R to correct for all 909 traits tested (OTUs and collapsed taxonomy bins). The traits
with a q value < 0.1 are presented in Table S2B. If all of the traits are included in the analysis, many of the traits with a q value
below 0.1 are redundant because they represent the same taxa sampled at different levels in the phylogeny. To address this
redundancy, we recalculated the q values while omitting the OTUs from the analysis and also when including only the families
(Tables S2C–S2E).
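The permutation p value and the multiple testing correction can be sketched in a few lines. This is a plain re-implementation of the Benjamini-Hochberg step-up adjustment for illustration, not the R routine used in the study, and the input values are made up.

```python
def empirical_p(observed, permuted):
    """Share of permuted estimates that meet or exceed the observed one."""
    return sum(1 for value in permuted if value >= observed) / len(permuted)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p values for four traits.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.1 for q in q_values]
```

Recomputing the q values on a subset of traits, as described for the family-only analysis, simply means calling the adjustment with the shorter list of p values, which changes m and therefore every adjusted value.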
Association of Traits with BMI
We compared microbiotas of high-BMI (BMI > 30) to low-BMI (BMI < 25) individuals to determine which taxa were enriched or
depleted in each group. For each of the traits (residuals after regression of covariates, described above) we performed a t test.
p values were corrected for multiple testing using the Benjamini-Hochberg algorithm in R.

Using BMI as a Covariate in Heritability Analysis


Since obesity has been shown to impact the composition of the microbiota, we reran the heritability analysis on the taxa including
BMI as an additional covariate. We found a highly significant Pearson's correlation coefficient of 0.93 between the estimates with and
without BMI as a covariate. The most highly heritable traits (specifically the Christensenellaceae) maintained the high heritability with
the addition of BMI as a covariate. This analysis indicates that host genotype impacts the composition of the gut microbiome over and
above what can be attributed to host BMI. However, we note that host genetics may impact BMI through interactions with the
microbiota.
Heritability Analysis Applied to Published Twin Microbiome 16S rRNA Gene Sequence Data
16S rRNA gene sequence data for the Turnbaugh et al. (2009) and Yatsunenko et al. (2012) studies were downloaded from the QIIME
database (http://www.microbio.me/qiime/index.psp; study numbers 77 and 850 respectively). We also downloaded mapping files
containing the metadata (covariates) for the samples and the respective OTU tables built from closed reference-based OTU picking
against the GG database at 97% ID. For the Turnbaugh et al. data, if two samples were provided from the same individual, a single
sample was chosen randomly to be included in the analysis. The final Turnbaugh set consisted of 23 DZ twin pairs and 31 MZ twin
pairs, all of which were women. Ancestry and the number of sequences per sample were used as covariates. From the Yatsunenko
et al. (2012) study we only included data from the twin pairs aged 13 or older (note that all of these were located in the USA), yielding
34 DZ twin pairs and 29 MZ twin pairs. Age and the total number of sequences per sample were used as covariates, and since there
were both female and male participants in this data set, gender was also included as a covariate. We applied the ACE model to these
data and calculated the ICCs for each of these data sets as described above (Figures S2 and S3; Table S2). A p value for each was
also generated by permutation test as described above.
Co-occurrence Network
We used SparCC to calculate correlation coefficients between all bacterial and archaeal families (OTU sequence counts collapsed
at family level). Since co-occurrence calculations are sensitive to differences in sequencing depth between samples, we first rarefied the OTU table that excluded repeat samples at 80,000 sequences per sample. We also eliminated features (families) that were
found in fewer than 50% of samples. The feature elimination was done to control runtime and to reduce the potential for false
discovery of network edges. Although the rarefaction depth excludes many samples (222 remained), we chose this depth because
at lower rarefaction (i.e., < 80,000 sequences/sample), Christensenellaceae does not pass the filter of sharing by 50% of the
samples.
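The trade-off between rarefaction depth and the 50% sharing filter can be made concrete with a small sketch. The family names, counts, and depth are made up, and the function names are hypothetical; samples below the chosen depth are dropped, as happens when rarefying to a fixed number of sequences.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample one sample's counts to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads.
    """
    if sum(counts.values()) < depth:
        return None
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def shared_taxa(samples, min_fraction=0.5):
    """Taxa with a non-zero count in at least min_fraction of the samples."""
    taxa = {t for s in samples for t in s}
    return {t for t in taxa
            if sum(1 for s in samples if s.get(t, 0) > 0)
            >= min_fraction * len(samples)}


rng = random.Random(0)
raw = [{"fam_a": 60, "fam_b": 40}, {"fam_a": 90, "fam_c": 30}, {"fam_a": 10}]
rarefied = [r for r in (rarefy(s, 50, rng) for s in raw) if r is not None]
kept = shared_taxa(rarefied)
```

A deeper rarefaction discards more samples but gives low-abundance families a better chance of being detected in each remaining sample, which is why the sharing filter can pass or fail depending on the depth chosen.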
We ran SparCC using default settings, 500 bootstraps to assign p values, and divided the computation across 100 nodes on a large
cluster. From the pairwise correlation matrix returned by SparCC we made a co-occurrence network, where each node in the network
represents a family, and the edges between the nodes represent the above-threshold correlation coefficients between families.
The network was filtered to include only correlations with a two-tailed p value < 0.002, as assigned by SparCC. This value was
selected because it was 1/500 bootstraps and our experience with SparCC suggests a p value threshold of 0.05 produces significant
numbers of false edges. The network filtration and calculation were done using code available at http://www.github.com/wdwvt1/correlations/.
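Turning a pairwise correlation matrix into a filtered edge list can be sketched as below. This is not the code from the repository cited above; the matrices are made up, and the function name and edge tuple layout are hypothetical.

```python
def build_edges(families, correlations, p_values, alpha=0.002):
    """Keep family pairs whose two-tailed p value is below alpha."""
    edges = []
    for i in range(len(families)):
        for j in range(i + 1, len(families)):
            if p_values[i][j] < alpha:
                sign = "positive" if correlations[i][j] > 0 else "negative"
                edges.append(
                    (families[i], families[j], correlations[i][j], sign))
    return edges


# Made-up symmetric matrices for three families.
families = ["fam_a", "fam_b", "fam_c"]
correlations = [[1.0, 0.6, -0.4], [0.6, 1.0, 0.1], [-0.4, 0.1, 1.0]]
p_values = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.3], [0.0, 0.3, 0.0]]
edges = build_edges(families, correlations, p_values)
```

Only the upper triangle is visited, so each family pair yields at most one edge, and the sign recorded with the edge is what a display layer could map to edge color.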
The network was displayed using Cytoscape (Smoot et al., 2011). We used the edge-weighted (by correlation coefficient) Prefuse
Force Directed layout to display the network and reveal network modules. Negative correlations are represented by gray edges
and positive correlations by blue. The same procedure was performed on the Yatsunenko et al. (2012) data set rarefied at
929,918 sequences per sample.
Identification of Network Modules
To identify modules within the family level networks generated from the TwinsUK data set, we used the R packages flashClust and
dynamicTreeCut. FlashClust is a fast implementation of hierarchical clustering. We clustered the taxa based on the correlation coefficients (cor_matrix) returned by SparCC, where the dissimilarity matrix passed to flashClust was 1 − cor_matrix. Then the function
cutreeDynamic was used to identify modules in the data set (Figure S5A).
Analysis of Christensenellaceae in Published Data Sets
To assess (i) the prevalence of Christensenellaceae in published studies, and (ii) its association with health status or dietary factors, we selected a set of studies (listed in section below) that addressed diet and/or health-related questions and also had
adequate sequence coverage to detect Christensenellaceae sequences. All had been performed prior to the incorporation of
the name Christensenellaceae into the reference databases. For each data set, the mapping files and split-library sequence files
were downloaded from the QIIME database (http://www.microbio.me/qiime/index.psp). We performed closed-reference OTU
picking at 97% against all representative sequences from the 97% ID Greengenes OTUs that were classified as Christensenellaceae. For any given sample within a study, the number of sequences matching these OTUs was divided by the total number of
sequences for that sample to yield a relative abundance of Christensenellaceae. The study titles, the QIIME database IDs of their
data sets, the comparison that we made within the data sets, the test we used, and the outcome are listed in the following
section.
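The per-sample relative abundance described above is a simple ratio of matched reads to total reads. A minimal sketch with made-up read counts and hypothetical sample names:

```python
def relative_abundance(matched_reads, total_reads):
    """Fraction of a sample's reads that matched the target OTUs."""
    if total_reads == 0:
        raise ValueError("sample has no reads")
    return matched_reads / total_reads


# Made-up (matched, total) read counts per sample.
samples = {"sample_1": (120, 40000), "sample_2": (0, 25000)}
abundance = {name: relative_abundance(matched, total)
             for name, (matched, total) in samples.items()}
```

Dividing by each sample's own total keeps samples with different sequencing depths comparable before any between-group test is applied.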

16S rRNA Gene Data Sets Used for Association of Christensenellaceae Abundance with Health and Diet
| Ref | Title | QIIME ID | Comparison within Study | Statistical Test | Result |
| --- | --- | --- | --- | --- | --- |
| Papa et al. (2012) | Non-invasive mapping of the gut microbiota as a screening method for IBD in children and young adults | 1458 | Healthy versus IBD | Wilcoxon rank sum | Christensenellaceae enriched in healthy compared to pediatric and young adult IBD patient fecal samples (p = 0.0001) |
| Turnbaugh et al. (2009) | A core gut microbiome in obese and lean twins | 77 | Compared obese versus lean (only time point 2 samples) | Wilcoxon rank sum one sided (lean higher) | Lean has more than obese, but not significant (p = 0.07135) |
| Koenig et al. (2011) | Succession of microbial consortia in the developing infant gut microbiome | 101 | NA | NA | Christensenellaceae present at 8.6% in mother, and 20% in the infant meconium (first stool), and found at less than 5% at all other time points in infant |
| Muegge et al. (2011) | Diet drives convergence in gut microbiome functions across mammalian phylogeny and within humans | 626 | Compared diets | Kruskal-Wallis rank sum | Christensenellaceae is enriched in omnivores compared to herbivores and carnivores (p = 0.017) |
| Wu et al. (2011) | Linking Long-Term Dietary Patterns with Gut Microbial Enterotypes | 1011 | Correlation with all continuous dietary info | Spearman (Benjamini-Hochberg correction) | Nothing significantly correlated with Christensenellaceae |
| Martínez et al. (2010) | Resistant Starches Types 2 and 4 Have Differential Effects on the Composition of the Fecal Microbiota in Human Subjects | 495 | Compared dietary treatment among subjects | ANOVA | Resistant starch type did not affect Christensenellaceae levels within an individual |
| Koren et al. (2012) | Host remodeling of the gut microbiome and metabolic changes during pregnancy | 867 | Correlation with all continuous dietary info | Spearman (Benjamini-Hochberg correction) | Nothing significantly correlated with Christensenellaceae |
| Henao-Mejia et al. (2012) | Inflammasome-mediated dysbiosis regulates progression of NAFLD and obesity | 909 | Compared mouse genotypes | ANOVA | No genotype was significantly associated with relative abundance of Christensenellaceae |

PICRUSt
We used PICRUSt v1.0.0 to generate predicted metagenomes for each sample (Langille et al., 2013). Counts from the rarefied OTU
table (10,000 sequences per sample) were normalized by the known/predicted 16S gene copy number abundance, and functional
predictions of Clusters of Orthologous Groups (COGs), summarized to general category letter associations, were determined using precomputed files for the May 2013 Greengenes database. Relative abundances of the functional predictions were calculated and then
transformed as described above. Since the data are relative abundances rather than counts, an offset of one during the transformation can skew the data, so an optimal offset was determined by minimizing the squared skewness of the transformed data using
nlminb in R. The same covariates described above were regressed out, except the number of sequences per sample, and then
the residuals were standardized using stdres from the R package MASS. The ICC and ACE models were used to determine the heritability of the COGs using the same methods described above. We also compared the high-BMI and low-BMI individuals to determine which functions were enriched or depleted in the obese group of individuals using a t test. P values were corrected for multiple
testing using the Benjamini-Hochberg algorithm in R.
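Choosing an offset by minimizing squared skewness can be illustrated with a small sketch. Two simplifications are assumed and should be read as such: a log transform stands in for the Box-Cox transformation, and a grid search over a few candidate offsets stands in for the continuous optimizer nlminb. The abundance values are made up.

```python
import math


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def best_offset(values, candidates):
    """Candidate offset whose log-transformed data has the least squared skewness."""
    def cost(offset):
        return skewness([math.log(v + offset) for v in values]) ** 2
    return min(candidates, key=cost)


# Made-up relative abundances of one predicted function across samples.
cog_fractions = [0.001, 0.002, 0.004, 0.010, 0.030, 0.080]
candidates = [10 ** e for e in range(-6, 1)]   # 1e-6 up to 1
offset = best_offset(cog_fractions, candidates)
```

With values far below one, an offset of one dominates every term inside the logarithm and leaves the data nearly untransformed, which is the skewing effect that motivates searching for a smaller offset.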
Animal Experiments
All animal experimental procedures were reviewed and approved by the Institutional Animal Care and Usage Committee of Cornell
University. Six-week old germ-free (GF) Swiss Webster mice were purchased from Taconic Farms Inc. (Hudson, NY). None of the
Taconic mice used were siblings, and there is a low probability of any cousins used within a study.
Fecal Transplants from Lean and Obese TwinsUK Donors
Stool samples that were termed methanogen-positive contained approximately 0.2%–10% of sequence reads that corresponded to
methanogenic archaea. Stool samples that had no methanogen sequences were considered methanogen-negative. Under
anaerobic conditions in an anoxic glove box (Coy Lab Products, Grass Lake, MI), approximately 1 g of stool was resuspended in 15 ml
of anaerobic PBS that contained 2 mM DTT as a reducing agent. Each stool sample was vortexed for 5 min, removed from the anaerobic chamber, and then immediately used. In the initial experiment, we randomly assigned 21 (14 male, 7 female) 6-week-old Swiss
Webster germ-free mice (Taconic Farms) to one donor each such that initial mouse mean weights were equivalent between treatment
groups. Immediately prior to inoculation, the stool suspension was inverted 3 times and 500 µl were drawn up into a syringe fitted with
a 20G gavage needle; 300 µl were stored for subsequent DNA extraction and analysis, whereas the remaining 200 µl was immediately
inoculated into the recipient mouse via oral gavage. Fecal material from each donor was orally administered by gavage to 6-week old
germ-free Swiss Webster mice in a 1:1 donor:mouse ratio. Mice were single-housed, kept under a 12 hr light/dark cycle, and fed an
autoclaved 7017 NIH-31 mouse diet produced by Harlan Teklad (Madison, WI) ad libitum. Body weight and chow consumption were
monitored weekly, where chow was measured before and after cage changes. Chow consumption rates were not different between
treatment groups. A single mouse that had no remaining food in the cage at day 19 lost weight and was removed from any analysis at
day 19. Stool samples were harvested weekly and immediately placed on dry ice.
We replicated the experiment using stool samples from a set of 21 new donors, chosen similarly (by BMI and methanogen carriage). Again, 21 mice (female 6-week-old Swiss Webster germ-free mice) were each assigned to a unique donor. Over the duration
of the replication 3 mice died and were excluded from the data set, leaving 5 L+, 4 L-, 3 O+, and 5 O- recipient mice. Sample collection
and weight measurement were performed 20 hr, 5 days, and 10 days after inoculation as described above.
Fecal Transplants of C. minuta Amended Microbiome
This experiment was similar to the obese/lean transfer described above, except for the following differences: (i) all mice were female
(n = 24) and housed 4 per cage, with 3 cages per treatment; (ii) a single obese subject was selected as the donor based on a lack
of OTUs mapping to Christensenella (i.e., none out of 478,633 sequences obtained for that sample when the inoculum used in
the transplant was sequenced). C. minuta (DSM 22607) was grown in brain heart infusion broth supplemented with yeast (5 g/l),
menadione (1 mg/l), hemin (10 mg/l), and L-cysteine-HCl (0.5 g/l) at 37 °C under anaerobic conditions. Stool suspensions were
prepared as above, with the exception that the mice receiving C. minuta were given an inoculum containing an addition of approximately 1 × 10^8 C. minuta cells, and the donor stool lacking C. minuta was amended with the same volume of PBS as a vehicle
control.
The second C. minuta addition experiment was similar to the first, but had 21 mice that were divided into 3 treatments: minus
C. minuta, plus C. minuta, and plus heat-killed C. minuta. The minus and plus C. minuta samples were prepared as described
in the first experiment. To prepare the heat-killed C. minuta inoculum, the culture was autoclaved for 20 min, and the donor stool was
amended to contain approximately 1 × 10^8 C. minuta heat-killed cells. There were 7 mice per treatment group and mice were divided
into 2 cages per treatment, one containing 3 mice and the other cage containing 4.
The third C. minuta addition experiment also contained 21 mice, with 10 mice receiving an inoculum of donor stool amended with
heat-killed C. minuta that was prepared as described above, and 11 mice receiving donor stool amended with live C. minuta, prepared as above. Mice were housed 2 per cage (within the same treatment group), with the exception that one of the plus
C. minuta cages contained 3 mice.
Percent Body Fat
Directly after euthanasia, mice were scanned by DEXA (Lunar PIXImus Mouse, GE Medical Systems, Waukesha, WI).
Total Energy and Free Short-Chain Fatty Acid Measurements
Gross energy content of mouse stool samples was measured by bomb calorimetry using an IKA C2000 calorimeter (Dairy One,
Ithaca, NY). Wet cecal contents were weighed and resuspended in 2% (v/v) formic acid by vortexing. The sample was centrifuged
at 15,000 rpm for 5 min and the resulting supernatant was syringe filtered using a 0.22 µm filter to remove solids. One µl was injected
into the gas chromatograph (HP series 6890) with a flame ionization detector. The temperatures of the injector and detector were
200 °C and 275 °C, respectively. The column temperature was increased from 70 °C to 200 °C at a rate of 12 °C/min. SCFAs were separated using a Nukol capillary column (fused silica, 15 m × 0.53 mm × 0.5 µm, Supelco), using helium as the carrier gas at 21.4 ml/min.
Mouse Recipient Fecal and Cecal Bacterial Diversity
DNA was extracted from frozen mouse cecal and fecal pellets, and from aliquots of the gavage preparation (inoculum), as described
above. 16S rRNA gene sequences were obtained by PCR, sequenced, and analyzed as described above. Data for the obese/lean
donor transplant experiments were rarefied to 55,000 sequences per sample, and data for the Christensenella addition experiments
were rarefied to 11,228 sequences per sample.
SUPPLEMENTAL REFERENCES
Aronesty, E. (2011). ea-utils: Command-line tools for processing biological sequencing data (http://code.google.com/p/ea-utils).
Bray, J.R., and Curtis, J.T. (1957). An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27, 326–349.


Caporaso, J.G., Lauber, C.L., Walters, W.A., Berg-Lyons, D., Lozupone, C.A., Turnbaugh, P.J., Fierer, N., and Knight, R. (2011). Global patterns of 16S rRNA
diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4516–4522.
Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Stat. 11, 265–270.
Edgar, R.C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461.
Faith, D.P. (1992). Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10.
Henao-Mejia, J., Elinav, E., Jin, C., Hao, L., Mehal, W.Z., Strowig, T., Thaiss, C.A., Kau, A.L., Eisenbarth, S.C., Jurczak, M.J., et al. (2012). Inflammasome-mediated dysbiosis regulates progression of NAFLD and obesity. Nature 482, 179–185.
Smoot, M.E., Ono, K., Ruscheinski, J., Wang, P.L., and Ideker, T. (2011). Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27, 431–432.


Figure S1 panels: boxplots of weighted UniFrac, Bray-Curtis, and unweighted UniFrac distances (y axes run from more similar to more different) for MZ, DZ, and UN pairs. Panel titles: All Bacteria and Archaea, Bacteroidaceae, Ruminococcaceae, Lachnospiraceae.

Figure S1. Heritability of the Gut Microbiota in Twins, Related to Figure 1 and Table S1
(A–C) Boxplots of weighted UniFrac distances between microbial communities, using (A) the entire phylogenetic tree, (B) the family Bacteroidaceae and (C) the
Ruminococcaceae.
(D–F) Boxplots of Bray-Curtis dissimilarity indices between fecal microbial communities, using (D) the entire phylogenetic tree, (E) the family Bacteroidaceae and
(F) the Ruminococcaceae.
(G) Boxplot of unweighted UniFrac distances for the family Lachnospiraceae. MZ, monozygotic; DZ, dizygotic; UN, unrelated individuals. *p < 0.05, **p < 0.01,
***p < 0.001 (Student's t test with 1,000 Monte Carlo simulations).


Figure S2. MZ Twins Have Higher Correlations for OTU Abundances Than DZ Twins in the Turnbaugh and Yatsunenko Data Sets, Related to
Figure 2
(A and B) The difference in intraclass correlation coefficients (ICC) for MZ twin pairs versus DZ twin pairs for each OTU is displayed at the right for each tip of the
phylogeny. Bars pointing to the right indicate that the difference is positive (i.e., MZ ICCs > DZ ICCs) and bars pointing to the left indicate negative differences (DZ
ICCs > MZ ICCs).
(C and D) Distribution of twin-pair ICCs for OTU abundances. MZ bars are black, DZ bars are pale gray, and their overlap is shaded dark gray. A, C: ICCs
calculated for the Turnbaugh data. B, D: ICCs calculated for the Yatsunenko data.


Figure S3. Heritability of Microbial Abundances in Published Twin Data Sets, Related to Figure 3 and Table S2
(A–D) Heritabilities were calculated using data from Turnbaugh et al. (2009) and Yatsunenko et al. (2012). Heritability of microbial taxa for twin pairs from the
Turnbaugh data set (A and B) and the Yatsunenko (C and D) data set. (A and C) Heritabilities were estimated as the proportion of variance in the microbial
abundances that can be attributed to genetic effects (A, from the ACE model). The heritability estimates are mapped onto the phylogenetic tree shown in Figure 3
and displayed using a rainbow gradient from blue (A = 0) to red (A ≥ 0.4). The branches are colored gray if they do not include at least 50% of the participants for
the respective study. (B and D) The significance for the heritability values calculated for the respective studies was determined using a permutation test (n = 1,000)
and is shown on the same phylogeny. P values range from 0 (red) to > 0.1 (blue).
(E) Scatterplot showing the relationship between OTU heritability and the stability of each OTU assessed by ICCs between repeat samples.


Figure S4, panel A, heatmap row labels: Prevotellaceae, [Paraprevotellaceae], Desulfovibrionaceae, [Tissierellaceae], Enterobacteriaceae, Eubacteriaceae, Bifidobacteriaceae, Coriobacteriaceae, Ruminococcaceae, Lachnospiraceae, Erysipelotrichaceae, Carnobacteriaceae, Actinomycetaceae, Lactobacillaceae, Streptococcaceae, Turicibacteraceae, Clostridiaceae, Peptostreptococcaceae, Veillonellaceae, Pasteurellaceae, Rikenellaceae, [Barnesiellaceae], Porphyromonadaceae, [Odoribacteraceae], Bacteroidaceae, Alcaligenaceae, [Mogibacteriaceae], Oxalobacteraceae, Verrucomicrobiaceae, Unclassified SHA-98, Methanobacteriaceae, Christensenellaceae, Dehalobacteriaceae, Unclassified RF32, Unclassified ML615J-28, Peptococcaceae, Unclassified Clostridiales, Unclassified RF39.
Figure S4, panel B, labeled network nodes: Actinobacteria; Bifidobacteriaceae, Tenericutes; Unclassified RF39, Firmicutes; Unclassified SHA-98, Euryarchaeota; Methanobacteriaceae, Firmicutes; Dehalobacteriaceae, Tenericutes; Unclassified ML615J-28, Firmicutes; Christensenellaceae, Bacteroidetes; Bacteroidaceae. Remaining nodes are identified by letter. Color scale: Heritability (A) from 0.0 to >0.8.

Figure S4. Christensenellaceae Network in the Yatsunenko Data Set, Related to Figure 5
(A) The three TwinsUK microbial family co-occurrence network modules identified by the DynamicTreeCut R package. The colors on the top of the heatmap
represent the three modules. The coloring on the left of the heatmap is the heritability estimate of each microbial family. The heatmap shows the correlation
structure between all taxa calculated using SparCC.
(B) A network built from SparCC correlation coefficients between sequence abundances collapsed at the family level in the Yatsunenko data set. The nodes
represent each family and the edges represent the correlation between families. Edges are colored blue for a positive correlation and gray for a negative correlation, and the weight of the edge reflects the strength of the correlation. Nodes are positioned using an edge-weighted force directed layout. The nodes are
colored by the heritability of the family. Refer to Figure 4 legend for family lettering, additional families are: (ad) Actinobacteria; Corynebacteriaceae, (ae) Actinobacteria; Micrococcaceae, (af) Bacteroidetes; RF16, (ag) Bacteroidetes; Sphingobacteriaceae, (ah) Bacteroidetes; Unclassified Bacteroidales, (ai) Firmicutes;
Aerococcaceae, (aj) Firmicutes; Bacillaceae, (ak) Firmicutes; Gemellaceae, (al) Firmicutes; Planococcaceae, (am) Firmicutes; Staphylococcaceae, (an) Firmicutes; Unclassified Lactobacillales, (ao) Fusobacteria; Fusobacteriaceae, (ap) Lentisphaerae; Victivallaceae, (aq) Proteobacteria; Aeromonadaceae, (ar) Proteobacteria; Campylobacteraceae, (as) Proteobacteria; Comamonadaceae, (at) Proteobacteria; Desulfovibrionaceae, (au) Proteobacteria; Rhodocyclaceae, (av)
Unclassified TM7.


Figure S5. Effects of Methanogen Presence in Donor Stool, Related to Figure 6


(A and B) Principal coordinates analysis of unweighted UniFrac distances for human fecal microbiota from obese and lean donors with and without methanogens
before and after transplantation to germ-free mice. Points represent samples obtained from (i) the inoculum prior to transplantation, (ii) fecal samples at 4 time
points, and (iii) cecal samples at day 21. The points are colored according to the treatment group. Panel (A) shows the same data projection as in Figure 5C, with
points colored by treatment group. Panel (B) shows PC3 plotted against PC2. The amount of variance described by the PCs is shown on the axes.
(C–E) Box plots showing concentrations of propionate (C), butyrate (D) and acetate (E) measured in the ceca of mice 21 days post inoculation.
(F) Box plots of gross energy content for dry stool collected at day 12. Boxes with different letters adjacent to them have significantly different means, p < 0.05.
(G) Percent weight change over time for germ-free mouse recipients. Donor stools were obtained from lean or obese donors with or without detectable
methanogens and did not include any donors used in the initial experiment (Figure 6A). Mean values ± SEM. For all panels, Dark blue = L+, lean donor with
methanogens; Light blue = L-, lean donor lacking methanogens; Dark orange = O+, obese donor with methanogens; Light orange = O-, obese donor without
methanogens.


Figure S6 panels: box plots of weight change (%) and total adiposity (%) for mice receiving no added, heat-killed, or live Christensenella; PCoA plot (PC1, 28%; PC2, 5%) with points labeled Minus cage #1 to #3, Plus cage #1 to #3, Minus inoculum, and Plus inoculum.

Figure S6. Phenotype Effects of Christensenella Addition, Related to Figure 7


(A and B) First repeat of the addition experiment. Panels (A) and (B) are box plots that show percent weight change at day 23 relative to the starting mouse weight
for 6-wk old Swiss-Webster germ-free mice inoculated with stool from an obese donor lacking Christensenella with vehicle control versus live C. minuta addition
(A) or heat-killed control versus C. minuta addition (B). Note that the live C. minuta data are the same data represented in both panels (A) and (B).
(C and D) A second repeat (third iteration) of the experiment is shown in panels (C) and (D), which show the relative weight change at day 21 (C) and total adiposity
of mice at day 21 (D). N = 7–11 mice per treatment.
(E) The PCoA plot of the unweighted UniFrac distances for the 16S rRNA analysis of samples derived from the first C. minuta addition experiment. Both the donor
inoculum and 5 time points from mouse stool are shown. Points are colored by cage to show the co-caging effects on the mice.
