Professional Documents
Culture Documents
1 s2.0 S0960982216300641 Main
1 s2.0 S0960982216300641 Main
Report
*Correspondence: aimee.dudley@gmail.com
http://dx.doi.org/10.1016/j.cub.2016.02.012
SUMMARY RESULTS
Modern transportation networks have facilitated the S. cerevisiae Is Readily Isolated from Unroasted Cacao
migration and mingling of previously isolated popu- and Coffee Beans
lations of plants, animals, and insects. Human activ- To test the extent to which human activity may have influenced
ities can also influence the global distribution of the microorganisms associated with coffee and cacao fermenta-
microorganisms. The best-understood example is tion, we focused on one microbe, S. cerevisiae. The importance
of yeast in cacao fermentation is clear [13]. Cacao undergoes
yeasts associated with winemaking. Humans began
5–7 days of fermentation, seeded from the local flora during ma-
making wine in the Middle East over 9,000 years
nipulations of the cacao pods. Successions of yeasts, lactic acid
ago [1, 2]. Selecting favorable fermentation products bacteria, and acetic acid bacteria digest the pectinaceous pulp
created specialized strains of Saccharomyces cere- surrounding the beans and trigger biochemical changes that
visiae [3, 4] that were transported along with grape- impart flavor and color to the beans [13–15]. In contrast, the
vines. Today, S. cerevisiae strains residing in vine- microbiota of coffee fermentation is poorly understood, and
yards around the world are genetically similar, and only a few studies have detected yeasts in coffee fermentations
their population structure suggests a common origin [9, 15–18]. Coffee growers use different types of fermentation to
that followed the path of human migration [3–7]. Like digest the cherry pulp surrounding the beans. The most common
wine, coffee and cacao depend on microbial fermen- are ‘‘wet’’ 24–48 hr fermentation in water [16, 18] and ‘‘dry’’ 10–
tation [8, 9] and have been globally dispersed by hu- 25 day fermentation with rounds of spreading and heaping on a
platform [18]. Without direct access to the fermentations, we
mans. Theobroma cacao originated in the Amazon
attempted to culture live yeast from unroasted coffee and cacao
and Orinoco basins of Colombia and Venezuela
beans grown and processed in a variety of geographic locations
[10], was cultivated in Central America by Mesoamer- and ecological niches. From beans grown in Central America,
ican peoples, and was introduced to Europeans by South America, Africa, Indonesia, or the Middle East, we ob-
Hernán Cortés in 1530 [11]. Coffea, native to Ethiopia, tained 78 cacao strains from 13 countries and 67 coffee strains
was disseminated by Arab traders throughout the from 14 countries (Tables 1 and S1). For brevity, we will refer
Middle East and North Africa in the 6th century and to strains isolated from countries in Central and South America
was introduced to European consumers in the 17th as ‘‘South American’’ based on similarities in climate and
century [12]. Here, we tested whether the yeasts geographic proximity. Both cacao and coffee bean cultures
associated with coffee and cacao are genetically also contained a variety of bacteria, yeasts, and filamentous
similar, crop-specific populations or genetically fungi, suggesting that this approach could be readily adapted
for the isolation of other microorganisms.
diverse, geography-specific populations. Our results
uncovered populations that, while defined by niche
Coffee and Cacao Yeasts Form Discrete, Genetically
and geography, also bear signatures of admixture
Diverse Populations
between major populations in events independent To measure the genetic diversity of the yeast strains associated
of the transport of the plants. Thus, human-associ- with coffee and cacao beans, we used restriction site-associ-
ated fermentation and migration may have affected ated DNA sequencing (RAD-seq) [20] to sequence the same
the distribution of yeast involved in the production 3% of each strain’s genome. We compared this sequence
of coffee and chocolate. to polymorphism information across the same regions of the
Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved 965
Table 1. The Origin and Number of S. cerevisiae Strains from and 1D). In fact, virtual predictions of sample origin (where the
Coffee and Cacao Analyzed in This Study provenance of a strain is predicted based on that of the most
Isolates genetically similar strain) were accurate 86% of the time for ca-
cao strains and 79% of the time for coffee (Supplemental Exper-
Cacao Origin
imental Procedures). These results support the hypothesis that
Colombia 1
the yeasts isolated from imported bean samples originated
Costa Rica 11 from the local fermentations rather than from cross-contamina-
Dominican Republic 7 tion during distribution and suggest the existence of coffee-
Ecuador 12 and cacao-specific yeasts that differ based on the location
Ghana 5 where the beans were grown and fermented.
Haiti 7 To test this hypothesis more rigorously, we analyzed the pop-
Ivory Coast 6 ulation structure of the coffee and cacao yeasts using the Mar-
kov chain Monte Carlo algorithm InStruct [21]. This analysis
Madagascar 1
included RAD-seq data from the coffee and cacao strains
Nicaragua 2
isolated in this study, a previously published set of 262
Nigeria 7 S. cerevisiae strains from a variety of geographic origins and
Papua New Guinea 2 ecological niches [19], and 57 additional strains, including a
Peru 9 set of 13 Chinese tree strains (group 1), used to root and extend
Venezuela 8 the phylogenetic tree (Table S1). The deviance information cri-
Coffee Origin terion (Experimental Procedures) suggested 12 as the most
Colombia 6 likely number of populations (Figure S1). The population struc-
ture generated (Figures 2 and S2) largely agreed with previous
Costa Rica 3
analyses of data generated by microarray hybridization [6],
Ethiopia 3
RAD-seq [19], and whole-genome sequencing [5, 22], with
Guatemala 2 five populations (Table S2) that, except for the addition of novel
Honduras 14 strains, are largely unchanged from our previous analysis [19].
Indonesia 12 These include North America oak (group 3), New Zealand soil
Kenya 2 (group 11), Israel soil (group 12), Asia Mixed (group 7), and
Mexico 2 Pan Mixed 2 (group 6). The large European wine population
Nicaragua 3 expanded slightly to include strains previously assigned to other
populations, and the new Pan Mixed 1 group was formed by
Peru 5
strains that had been placed in other groups in our previous
Rwanda 5
analysis. With the exception of Pan Mixed 2, a human-associ-
Uganda 4 ated population with extensive aneuploidy, polyploidy, and
United States 1 heterozygosity (Figure S3), these populations harbor relatively
Yemen 5 few strains isolated from coffee or cacao beans (Table S3 and
The five Ghana cacao strains were isolated [14] and analyzed by RAD-seq Figure S2).
[19] in previous studies. All others were isolated in this study. See also Most coffee and cacao strains resided in four new populations
Table S1. of which they are the majority and in some cases exclusive mem-
bers: South America cacao (group 4), Africa cacao (group 8),
South America coffee (group 9), and Africa coffee (group 5).
genome for 35 wine strains from a previously published RAD-seq The population structure of cacao and coffee strains is signifi-
dataset [19]. Our results (Figures 1A and 1B) show a sharp cant for several reasons. First, in contrast to the wine strains,
contrast between the wine, coffee, and cacao strains. Similar the coffee and cacao yeasts form multiple, discrete populations.
to previous studies [4–6], we observed limited genetic diversity The fact that not all coffee or cacao strains are related suggests
between wine strains, with a median pairwise p distance of independent origins for distinct populations of yeast associated
1.28e-3 (Experimental Procedures). The yeast strains isolated with the same human-associated activities. Second, while these
from both coffee and cacao beans exhibited significantly greater populations reflect ecological niche, i.e., coffee versus cacao
diversity than the wine yeast strains did, with median pairwise p (Table S3 and Figure S2), they also reflect geographical origin
distances of 3.39e-3 and 2.95e-3, respectively (p = 9.5e-162 and (Figure 2). The coffee strains provide the clearest example,
p = 4.6e-156, Mood’s median test). with South American and African coffee strains each forming a
The greater genetic diversity of coffee and cacao strains rela- single population (Table S3) that further clusters by country (Fig-
tive to wine strains is less consistent with a crop-specific origin of ure 1C). While the cacao strains also showed strong country-
these yeasts but is by no means conclusive evidence against it. level clustering (Figure 1D), each of the two major populations
To explore this question in more detail, we examined the genetic include strains from samples whose declared continent of origin
similarity of strains isolated from geographically proximal loca- is different from other population members (Table S3). This may
tions. Both coffee and cacao strains show strong country-level reflect a more complex pattern of migration events for cacao
clustering, even though bean samples were obtained from mul- strains than for coffee strains, although mislabeling of sample
tiple suppliers in different places at different times (Figures 1C origin is also possible.
966 Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved
Figure 1. Genetic Diversity of Yeast Strains Isolated from Different Niches
(A) MDS (multidimensional scaling) plot of genetic distance (Euclidian) between all coffee, cacao, and vineyard strains using first and second principal coordinates
(x and y axes), which explain 18% and 7% of variation among strains, respectively.
(B) Distributions of the pairwise p distances between strains.
(C) MDS plot of genetic distance for coffee strains where the first and second principal coordinates (x and y axes) explain 22% and 8% of variation among strains,
respectively.
(D) MDS plot of genetic distance between cacao strains where the first and second principal coordinates (x and y axes) explain 18% and 11% of variation among
strains, respectively.
See also Table S4.
Admixture and Migration Events locations and ecological niches, particularly because (with the
Because their population structure suggested that the yeasts exception of vineyards) the Southern Hemisphere had been
associated with coffee and cacao beans are members of new largely uncharacterized. Interestingly, the new coffee and cacao
groups, we sought to understand the origin of these populations. populations were not composed of strains with novel alleles
Previous analyses had identified three major populations of (Figure S2) but were instead admixtures of the three known
S. cerevisiae (North American oak, which is related to a Japanese yeast populations. Furthermore, these admixtures roughly corre-
oak population [23], European vineyard, and Asian populations) sponded to the geographic proximity of the samples’ origin and
and strains with substantial admixture between them [19]. We patterns of human migration. For example, the two South Amer-
anticipated finding novel populations of yeast by sampling new ican populations (SA coffee and SA cacao) share alleles with the
Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved 967
Figure 2. Genetic Distance between Yeast
Strains
Each strain is plotted along first and second
principal coordinates obtained from MDS of ge-
netic distance (Euclidian). Populations are labeled
by the most common source and/or geographic
location from which they were originally isolated.
Populations containing strains isolated from Cen-
tral and South America are labeled South America.
Each strain is color coded by populations inferred
from InStruct with acronyms NA (North America),
SA (South America), NZ (New Zealand). The
first and second principal coordinates explain
22% and 7% of variation among strains, respec-
tively. See also Figures S1–S3 and Tables S2, S3,
and S5.
968 Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved
Figure 3. Maximum-Likelihood Tree of
Genetic Relationships among Populations
and Admixture Events Based on the
TreeMix Model
Branch lengths are proportional to drift in allele
frequencies between populations, shown by the
scale along with the standard error (s.e.) of the
sample covariance matrix. Arrows show migration
events resulting in admixture: red arrows indicate
migration events that passed the significance
threshold of the three-population test (f3), while
black arrows indicate predicted migration events
that were not statistically significant. The yellow
node indicates a TreeMix migration event from the
cluster that contains both the NA oak and Asian
groups to SA coffee. Migration events from both
of these populations to SA coffee were significant
by the f3 test.
Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved 969
a burn of 20,000 iterations followed by an additional 20,000 iterations. We 6. Schacherer, J., Shapiro, J.A., Ruderfer, D.M., and Kruglyak, L. (2009).
chose the optimal number of populations by requiring the change in the devi- Comprehensive polymorphism survey elucidates population structure of
ance information criterion (DIC) to be greater than the standard error in DIC Saccharomyces cerevisiae. Nature 458, 342–345.
from 20 independent runs. 7. Goddard, M.R., Anfang, N., Tang, R., Gardner, R.C., and Jun, C. (2010).
p distance was calculated as the proportion of non-identical genotypes A distinct population of Saccharomyces cerevisiae in New Zealand: evi-
observed across all 223,311 typed base positions. Heterozygous-homozy- dence for local dispersal by insects and human-aided global dispersal in
gous differences were given a value of 0.5. oak barrels. Environ. Microbiol. 12, 63–73.
Multidimensional scaling and cluster analysis were used to further visualize
8. Papalexandratou, Z., Falony, G., Romanens, E., Jimenez, J.C., Amores,
the relationships of new and admixed populations to those previously identi-
F., Daniel, H.M., and De Vuyst, L. (2011). Species diversity, community
fied [19]. Multidimensional scaling was applied to all sites using Euclidian
dynamics, and metabolite kinetics of the microbiota associated with tradi-
identity-by-state distance (with state distance between non-identical homozy-
tional ecuadorian spontaneous cocoa bean fermentations. Appl. Environ.
gous/hemizygous alleles encoded as 2 and between heterozygous and homo-
Microbiol. 77, 7698–7714.
zygous sites encoded as 1) and the ‘‘cmdscale’’ function in R. Cluster analysis
was applied to 2,615 sites with minor allele frequencies of 1% or greater and 9. Silva, C.F., Batista, L.R., Abreu, L.M., Dias, E.S., and Schwan, R.F. (2008).
less than 1% missing data using hierarchical complete linkage clustering. For Succession of bacterial and fungal communities during natural coffee
comparisons between wine/vineyard, coffee, and cacao yeast strains, any (Coffea arabica) fermentation. Food Microbiol. 25, 951–957.
vineyard or winemaking strains associated with Vitis species other than Vitis 10. Motamayor, J.C., Lachenaud, P., da Silva E Mota, J.W., Loor, R., Kuhn,
vinifera were excluded. D.N., Brown, J.S., and Schnell, R.J. (2008). Geographic and genetic pop-
ulation differentiation of the Amazonian chocolate tree (Theobroma cacao
ACCESSION NUMBERS L). PLoS ONE 3, e3311.
11. Young, A.M. (1994). The Chocolate Tree: A Natural History of Cacao
The accession number for the read sequences generated for this study (Smithsonian Institution Press).
is European Nucleotide Archive: PRJEB12530.
12. Pendergrast, M. (1999). Uncommon Grounds: The History of Coffee and
How It Transformed Our World (Basic Books).
SUPPLEMENTAL INFORMATION
13. Schwan, R.F., and Wheals, A.E. (2004). The microbiology of cocoa
Supplemental Information includes three figures, five tables, and Supple- fermentation and its role in chocolate quality. Crit. Rev. Food Sci. Nutr.
mental Experimental Procedures and can be found with this article online at 44, 205–221.
http://dx.doi.org/10.1016/j.cub.2016.02.012. 14. Jespersen, L., Nielsen, D.S., Hønholt, S., and Jakobsen, M. (2005).
Occurrence and diversity of yeasts involved in fermentation of West
AUTHOR CONTRIBUTIONS African cocoa beans. FEMS Yeast Res. 5, 441–453.
15. Schwan, R.F., and Wheals, A.E. (2003). Mixed microbial fermentations of
Conceptualization, A.M.D., C.L.L., and C.G.-T.; Methodology, G.A.C., J.C.F.,
chocolate and coffee. In Yeasts in Food, T. Boekhout, and V. Robert,
E.W.J., and C.L.L.; Investigation, G.A.C., J.C.F., C.F., M.H., E.W.J., C.L.L.,
eds. (CRC Press).
A.S., and C.G.-T.; Writing – Original Draft, C.L.L.; Writing – Review & Editing,
A.M.D. and C.L.L.; Funding Acquisition, A.M.D. and J.C.F.; Supervision, A.M.D. 16. Agate, A.D., and Bhat, J.V. (1966). Role of pectinolytic yeasts in the degra-
dation of mucilage layer of Coffea robusta cherries. Appl. Microbiol. 14,
ACKNOWLEDGMENTS 256–260.
17. Masoud, W., Cesar, L.B., Jespersen, L., and Jakobsen, M. (2004). Yeast
The authors thank Amir Sherman (Agricultural Research Organization) for help- involved in fermentation of Coffea arabica in East Africa determined
ful discussions, Dan Ollis and Perry Hook of Victrola Coffee for gifts of coffee by genotyping and by direct denaturating gradient gel electrophoresis.
beans, Andy McShea of Theo Chocolate and John Nanci of Chocolate Yeast 21, 549–556.
Alchemy for gifts of cacao beans, and Feng-Yan Bai (Chinese Academy of Sci- 18. Silv, C.F., Schwan, R.F., Sousa Dias, E.S., and Wheals, A.E. (2000).
ences) and James Swezey (USDA/ARS) for providing yeast strains. This work Microbial diversity during maturation and natural processing of coffee
was funded by a strategic partnership between the University of Luxembourg cherries of Coffea arabica in Brazil. Int. J. Food Microbiol. 60, 251–260.
and the Institute for Systems Biology and by NIH grant GM080669 to J.C.F.
19. Cromie, G.A., Hyma, K.E., Ludlow, C.L., Garmendia-Torres, C., Gilbert,
T.L., May, P., Huang, A.A., Dudley, A.M., and Fay, J.C. (2013). Genomic
Received: December 10, 2015
sequence diversity and population structure of Saccharomyces cerevisiae
Revised: January 30, 2016
assessed by RAD-seq. G3 (Bethesda) 3, 2163–2171.
Accepted: February 3, 2016
Published: March 24, 2016 20. Baird, N.A., Etter, P.D., Atwood, T.S., Currey, M.C., Shiver, A.L., Lewis,
Z.A., Selker, E.U., Cresko, W.A., and Johnson, E.A. (2008). Rapid SNP dis-
REFERENCES covery and genetic mapping using sequenced RAD markers. PLoS ONE 3,
e3376.
1. Cavalieri, D., McGovern, P.E., Hartl, D.L., Mortimer, R., and Polsinelli, M. 21. Gao, H., Williamson, S., and Bustamante, C.D. (2007). A Markov chain
(2003). Evidence for S. cerevisiae fermentation in ancient wine. J. Mol. Monte Carlo approach for joint inference of population structure and
Evol. 57 (Suppl 1 ), S226–S232. inbreeding rates from multilocus genotype data. Genetics 176, 1635–
2. McGovern, P.E. (2003). Ancient Wine: The Search for the Origins of 1651.
Viniculture (Princeton University Press). 22. Strope, P.K., Skelly, D.A., Kozmin, S.G., Mahadevan, G., Stone, E.A.,
3. Fay, J.C., and Benavides, J.A. (2005). Evidence for domesticated and wild Magwene, P.M., Dietrich, F.S., and McCusker, J.H. (2015). The 100-
populations of Saccharomyces cerevisiae. PLoS Genet. 1, 66–71. genomes strains, an S. cerevisiae resource that illuminates its natural
4. Legras, J.L., Merdinoglu, D., Cornuet, J.M., and Karst, F. (2007). Bread, phenotypic and genotypic variation and emergence as an opportunistic
beer and wine: Saccharomyces cerevisiae diversity reflects human his- pathogen. Genome Res. 25, 762–774.
tory. Mol. Ecol. 16, 2091–2102. 23. Almeida, P., Barbosa, R., Zalar, P., Imanishi, Y., Shimizu, K., Turchetti, B.,
5. Liti, G., Carter, D.M., Moses, A.M., Warringer, J., Parts, L., James, S.A., Legras, J.L., Serra, M., Dequin, S., Couloux, A., et al. (2015). A population
Davey, R.P., Roberts, I.N., Burt, A., Koufopanou, V., et al. (2009). genomics insight into the Mediterranean origins of wine yeast domestica-
Population genomics of domestic and wild yeasts. Nature 458, 337–341. tion. Mol. Ecol. 24, 5412–5427.
970 Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved
24. Pickrell, J.K., and Pritchard, J.K. (2012). Inference of population splits 29. Knight, S., Klaere, S., Fedrizzi, B., and Goddard, M.R. (2015). Regional mi-
and mixtures from genome-wide allele frequency data. PLoS Genet. 8, crobial signatures positively correlate with differential wine phenotypes:
e1002967. evidence for a microbial aspect to terroir. Sci. Rep. 5, 14233.
25. Wang, Q.M., Liu, W.Q., Liti, G., Wang, S.A., and Bai, F.Y. (2012).
30. Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with
Surprisingly diverged populations of Saccharomyces cerevisiae in natural
Burrows-Wheeler transform. Bioinformatics 25, 1754–1760.
environments remote from human activity. Mol. Ecol. 21, 5404–5417.
26. Reich, D., Thangaraj, K., Patterson, N., Price, A.L., and Singh, L. (2009). 31. DePristo, M.A., Banks, E., Poplin, R., Garimella, K.V., Maguire, J.R., Hartl,
Reconstructing Indian population history. Nature 461, 489–494. C., Philippakis, A.A., del Angel, G., Rivas, M.A., Hanna, M., et al. (2011). A
framework for variation discovery and genotyping using next-generation
27. Hyma, K.E., Saerens, S.M., Verstrepen, K.J., and Fay, J.C. (2011).
DNA sequencing data. Nat. Genet. 43, 491–498.
Divergence in wine characteristics produced by wild and domesticated
strains of Saccharomyces cerevisiae. FEMS Yeast Res. 11, 540–551. 32. McLaren, W., Pritchard, B., Rios, D., Chen, Y., Flicek, P., and
28. Bokulich, N.A., Thorngate, J.H., Richardson, P.M., and Mills, D.A. (2014). Cunningham, F. (2010). Deriving the consequences of genomic variants
Microbial biogeography of wine grapes is conditioned by cultivar, vintage, with the Ensembl API and SNP Effect Predictor. Bioinformatics 26,
and climate. Proc. Natl. Acad. Sci. USA 111, E139–E148. 2069–2070.
Current Biology 26, 965–971, April 4, 2016 ª2016 Elsevier Ltd All rights reserved 971