
Convergent evolution in European and Rroma populations reveals pressure exerted by plague on Toll-like receptors
Hafid Laayouni(a,1), Marije Oosting(b,c,1), Pierre Luisi(a), Mihai Ioana(b,d), Santos Alonso(e), Isis Ricaño-Ponce(f), Gosia Trynka(f,2), Alexandra Zhernakova(f), Theo S. Plantinga(b,c), Shih-Chin Cheng(b,c), Jos W. M. van der Meer(b,c), Radu Popp(g), Ajit Sood(h), B. K. Thelma(i), Cisca Wijmenga(f), Leo A. B. Joosten(b,c), Jaume Bertranpetit(a,3), and Mihai G. Netea(b,c,3,4)

(a) Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Cientificas–Universitat Pompeu Fabra), Universitat Pompeu Fabra, 08003 Barcelona, Spain; (b) Department of Medicine and (c) Nijmegen Institute for Infection, Inflammation and Immunity, Radboud University Nijmegen Medical Centre, 6525 GA, Nijmegen, The Netherlands; (d) University of Medicine and Pharmacy Craiova, 200349 Craiova, Romania; (e) Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country, Barrio Sarriena s/n, 48940 Leioa, Spain; (f) Department of Genetics, University of Groningen/University Medical Center Groningen, 9700 RB, Groningen, The Netherlands; (g) Department of Medical Genetics, “Iuliu Hatieganu” University of Medicine and Pharmacy, 400023 Cluj-Napoca, Romania; (h) Department of Gastroenterology, Dayanand Medical College and Hospital, Ludhiana, Punjab 141001, India; and (i) Department of Genetics, University of Delhi South Campus, New Delhi 110 021, India

Edited* by Charles A. Dinarello, University of Colorado Denver, Aurora, CO, and approved January 2, 2014 (received for review September 19, 2013)

IMMUNOLOGY

Recent historical periods in Europe have been characterized by severe epidemic events such as plague, smallpox, or influenza that shaped the immune system of modern populations. This study aims to identify signals of convergent evolution of the immune system, based on the peculiar demographic history in which two populations with different genetic ancestry, Europeans and Rroma (Gypsies), have lived in the same geographic area and have been exposed to similar environments, including infections, during the last millennium. We identified several genes under evolutionary pressure in European/Romanian and Rroma/Gipsy populations, but not in a Northwest Indian population, the geographic origin of the Rroma. Genes in the immune system were highly represented among those under strong evolutionary pressures in Europeans, and infections are likely to have played an important role. For example, the Toll-like receptor 1 (TLR1)/TLR6/TLR10 gene cluster showed a strong signal of adaptive selection. Their gene products are functional receptors for Yersinia pestis, the agent of plague and one possible infection that may have exerted evolutionary pressures, as shown by overexpression studies demonstrating induction of proinflammatory cytokines such as TNF, IL-1β, and IL-6. Immunogenetic analysis showed that TLR1, TLR6, and TLR10 single-nucleotide polymorphisms modulate Y. pestis–induced cytokine responses. Other infections may also have played an important role. Thus, reconstruction of evolutionary history of European populations has identified several immune pathways, among them TLR1/TLR6/TLR10, as being shaped by convergent evolution in two human populations with different origins under the same infectious environment.

immunity | pattern recognition receptors | pandemics | migration

Significance

This article gives a unique perspective on the impact of evolution on the immune system under pressure by infections, using the special demographic history of Europe in which two populations with different genetic ancestry, Europeans and Rroma (Gypsies), have lived in the same geographic area and have been exposed to similar environmental hazards, including infections. We identified convergent evolution signals in genes from different human populations. Reconstruction of evolutionary history of European populations has identified Toll-like receptor 1 (TLR1)/TLR6/TLR10 as a pattern recognition pathway shaped by convergent evolution by infections, among which plague is a likely cause, influencing the survival of these populations during the infection.

Author contributions: H.L., J.W.M.v.d.M., A.S., B.K.T., C.W., L.A.B.J., J.B., and M.G.N. designed research; H.L., M.O., P.L., M.I., S.A., I.R.-P., G.T., A.Z., T.S.P., S.-C.C., R.P., A.S., and L.A.B.J. performed research; M.O., P.L., M.I., S.A., I.R.-P., G.T., A.Z., T.S.P., S.-C.C., and R.P. contributed new reagents/analytic tools; H.L., M.O., P.L., M.I., S.A., I.R.-P., G.T., A.Z., T.S.P., S.-C.C., R.P., and M.G.N. analyzed data; and H.L., M.O., M.I., S.A., J.W.M.v.d.M., A.S., B.K.T., C.W., L.A.B.J., J.B., and M.G.N. wrote the paper.
The authors declare no conflict of interest.
*This Direct Submission article had a prearranged editor.
1 H.L. and M.O. contributed equally to this work.
2 Present address: Division of Genetics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115; and Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142.
3 J.B. and M.G.N. share senior authorship.
4 To whom correspondence should be addressed. E-mail: Mihai.Netea@radboudumc.nl.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1317723111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1317723111 PNAS Early Edition | 1 of 6

By recognition and elimination of pathogenic microorganisms during infection, the immune system has allowed mankind to survive. Genetic variation in the immune system is a major factor influencing susceptibility to infections. Consequently, genes of the immune system are under constant evolutionary pressure (1), and this pressure can change based on local conditions and migration routes of human populations (2).
In time, changes induced in the immune system by infectious pressures can shape not only the host defense and susceptibility to infections but also susceptibility to autoimmune or inflammatory diseases of modern human populations (2), with balancing selection proposed as a main force shaping the innate immunity reaction (3). It has been suggested that a predominantly proinflammatory profile in the immune system, induced by infections, predisposes modern human populations to autoimmune diseases (4, 5), whereas selection of certain genetic variants during epidemics [e.g., selection of C-C chemokine receptor type 5 (CCR5) variants presumably by plague] reduces susceptibility to HIV infection in modern Europeans compared with Africans (6). All these studies have investigated candidate genes selected on the basis of biological assumptions, but comprehensive genome-wide approaches to identify the immune pathways under evolutionary pressure by infections are missing.
In this study, we make use of a special historical demographic situation present in Europe: ancient European populations living together with Rroma in the same geographic locations. Rroma (traditionally called Gypsies) are a population from Northwest India that migrated to Europe one millennium ago (7). We hypothesized that despite their different ethnic and genetic backgrounds, the strong infectious pressure exerted by the major epidemics of the last millennium (of which epidemics of plague are probably the most significant) has led to convergent evolution: specific immune genes, selected during these European epidemics, become signatures that differ from


those found in the Northwest Indian populations from whom the Rroma have derived (7). These signatures would enable us to detect recent adaptations and could lead to the understanding of susceptibility to infections (and other immune-mediated diseases) in modern European populations.

Results
Populations. The population of Romania is comprised mainly of Indo-European populations, among which Romanian speakers represent 88% of the population, whereas 3.2% of inhabitants are of Rroma ethnic background (www.recensamantromania.ro). After ethical approval by the Ethics Committee of the University of Craiova, Romania, informed consent was obtained for all volunteers and DNA samples were collected from individuals of European/Romanian or Rroma ethnic background. A population of individuals of Northwestern Indian descent, representing the geographic origin of the Rroma group (Fig. 1A), was also recruited.
We assayed 196,524 single-nucleotide polymorphisms (SNPs) using the Illumina immunochip array (8) in all three populations. Analysis of genetic distance and principal component analysis between these populations based on nongenic, and thus presumably neutral, SNPs show clear differences between the three populations studied. Admixed individuals and erroneous self-assigned ancestry were examined using principal components analysis (PCA) implemented in eigensoft program (9) and plotted using multidimensional scaling (Fig. 1B). Individuals showing admixture ancestry or false allocation were excluded from further analysis. A plot of the first versus the second eigenvectors (Fig. 1B) shows a clear differentiation of the Rroma cluster of individuals from the Romanian and the Indian populations. However, Rroma are very close to Indians across eigenvector 1, in agreement with their evolutionary history. This indicates these population labels have a genetic basis and are not merely social constructs.

Fig. 1. Geographic origin of the three populations studied. (A) European/Romanians and Rroma/Gipsy share the same location, even if the origin of the latter is in North India. (B) Plot of the populations under analysis according to the coordinates to the two main eigenvectors of smartpca (Eigensoft) analysis, in which each dot represents an individual. Individuals within the circles and the same color have been considered for the study; those of different colors represent false population allocation and those intermediate represent admixed individuals. ROM, nongypsy Romanians; INDI, individuals from North India; GYP, Rroma/Gypsies living in Romania.

Evolutionary Analysis Identifies Innate Immune Pathways and TLR1/TLR6/TLR10 Among Genes Under Common Selection Pressure in Europeans/Romanians and Rroma. To identify signals of positive selection shared between Europeans and Rroma but not present in the Indian population, we looked for shared signals of important genetic differentiation between these two populations with the Indian population, accompanied by the absence of genetic differentiation between them. Two tests were used: (i) Cross-Population Composite Likelihood Ratio (XP-CLR) (10), which is a test that aims to identify selective sweeps in a population by detecting important genetic differentiation in an extended genomic region by including information about linkage disequilibrium without requiring haplotype information, and (ii) TreeSelect test (11), which is a tree-based method that incorporates allele frequency information from all populations analyzed to increase power to detect selection and distinguishes which population has been under positive selection. A window was considered to show an extreme score if its summary statistic (maximum in the case of XP-CLR, mean in case of TreeSelect statistic) belonged to the 1% upper tail of the genome-wide summary statistic distribution. Therefore, for XP-CLR, we were interested in windows with the extreme 1% signal of population differentiation both between Rroma and Indians and between Europeans and Indians, as long as these windows did not belong to the 5% extreme distribution for the Rroma versus European comparison. For TreeSelect, we listed the windows belonging to the 1% upper tail of the distribution for Rromas and Romanians as long as they do not belong to the 5% upper tail of the distribution in Indians. Table 1 lists the genes contained in windows that fulfill these criteria, along with other genes highly significant in any of the tests in any of the three populations analyzed. Manhattan plots for TreeSelect and XP-CLR statistics are shown in Fig. 2 A and B, respectively, where the strong concordance between both tests can be seen.
We investigated the overrepresentation of categories of genes detected to show similar selection signals in Rroma and Romanians and not in Indians, using Protein Analysis Through Evolutionary Relationships (PANTHER) (12) analysis. Table 2 shows the overrepresented molecular functions and biological processes with the contributing genes. The Toll-like receptor (TLR)/cytokine–mediated signaling pathway group, which comprises the genes TLR1, TLR6, and TLR10 (in the second cluster of Table 1), appears at the top of groups overrepresented with a P value = 0.00381.
The finding of the TLR2 gene cluster as under positive selection is of great relevance in looking for convergent selection in Rromas and Romanians. To overcome a possible lack of power of detecting selection in Indians for this cluster, we examined the derived allele frequency (DAF) of SNPs that show signals of positive selection in this study. SNP rs4833103 has a DAF in Rroma of 0.3, in Romanians 0.5, and in Indians 0.02. For SNP imm_4_38475934, the DAF in Rroma is 0.05, in Romanians 0.04, and in Indians 0.007. This result suggests that the signals of positive selection can be attributed only to Rroma and Romanians. Moreover, population differentiation estimated by FST statistic shows that most of the SNPs within this cluster have high differentiation between Rroma and Indians and between Romanians and Indians but not between Rroma and Romanians. The case of SNP rs4833103 is of special interest; this SNP shows an FST between Rroma and Indians of 0.49, between Romanians and Indians 0.69, and between Rroma and Romanians 0.04 (Fig. S1),



Table 1. Genes with extreme values of XP-CLR statistic and TreeSelect test, indicative of putative signals of positive selection

Genes                              Chromosome  Test        Populations
SLC45A2, ADAMTS12, AMACR, RXFP3    chr5        XP-CLR      Rroma and Romanians vs. Indians
                                               TreeSelect  All
TLR1, TLR6, TLR10, FAM114A1        chr4        XP-CLR      Rroma and Romanians vs. Indians
                                               TreeSelect  Rroma and Romanians
FBXL19, SETD1A, STX1B, STX         chr16       TreeSelect  Indians
BTNL2, HLA-DRA                     chr6        TreeSelect  Rroma and Romanians
ANK3                               chr10       TreeSelect  Rroma and Romanians
BAZ1A, SRP54                       chr14       XP-CLR      Romanians vs. Indians
KCNK10                             chr14       XP-CLR      Rroma vs. Indians
NEK7                               chr1        XP-CLR      Rroma vs. Romanians
Ataxin2                            chr12       XP-CLR      Romanians vs. Indians

Genes that appear in the same row belong to the same chromosomal region and are in a linkage disequilibrium block. Using the XP-CLR statistic, the interest is in genes with signals in Romanians compared with Indians, and in Rroma compared with Indians, but not present in Romanians compared with Rroma; for TreeSelect, the interest is in signals in Rroma and Romanians but not Indians. We report other cases even if they do not fulfill the above criteria and are not of direct interest to this study.
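The XP-CLR selection rule summarized in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the per-window score lists, and the way the empirical tail cutoff is computed are hypothetical and are not taken from the XP-CLR software or from the authors' pipeline. The 1% and 5% tail fractions follow the criteria given in the text.

```python
def in_upper_tail(scores, tail):
    """Flag windows whose score lies in the upper `tail` fraction.

    Assumption: the cutoff is taken from the empirical genome-wide
    distribution of per-window scores; with tied scores fewer windows
    than the nominal fraction may be flagged.
    """
    ordered = sorted(scores)
    n_tail = max(1, round(len(ordered) * tail))
    if n_tail >= len(ordered):
        return [True] * len(ordered)
    threshold = ordered[-n_tail - 1]
    return [s > threshold for s in scores]


def convergent_windows(rroma_vs_indians, romanians_vs_indians,
                       rroma_vs_romanians, strict=0.01, loose=0.05):
    """Indices of windows extreme in both comparisons against Indians
    (upper 1%) but not extreme between Rroma and Romanians (upper 5%)."""
    a = in_upper_tail(rroma_vs_indians, strict)
    b = in_upper_tail(romanians_vs_indians, strict)
    c = in_upper_tail(rroma_vs_romanians, loose)
    return [i for i in range(len(a)) if a[i] and b[i] and not c[i]]


# Toy data (hypothetical scores for 200 windows).
vs_indians = list(range(200))
between = [0.0] * 200
between[199] = 10.0  # window 199 also differs between Rroma and Romanians
print(convergent_windows(vs_indians, vs_indians, between))
```

In this toy example, windows 198 and 199 are extreme against Indians, but window 199 is also differentiated between Rroma and Romanians, so only window 198 is retained.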

values of undoubted relevance for the present framework. Notably, this SNP (intergenic between TLR1 and TLR6) was reported to be an expression quantitative trait locus for three genes, TLR1, TLR6, and TLR10, in lymphoblastoid cell lines (LCLs) (13).
We also performed an additional analysis using genotype data from the Illumina Omni 2.5M Chip for the 1000 Genome Project for individuals (14) in an Indian (Gujarati) and European (Northern Europeans from Utah, CEU) population. XP-CLR statistic was used to detect selection in this Indian population. Results show that there is a clear signal of selection in the European population (CEU) compared with the Indian (Gujarati) population, but no signal of selection was detected in this Indian population compared with the European population (Fig. S2 A and B).

Fig. 2. Manhattan plot of results of selection tests in Rroma, Romanians, and Indians using TreeSelect statistic (A) and XP-CLR statistic (B). Chromosomes ordered from chromosome 1 to chromosome 22. Loci labeled in A: SLC45A2, ADAMTS12; TLR1, TLR6, TLR10; BTNL2 (MHC II-III); ANK3. Loci labeled in B: Ataxin2; TLR1, TLR6, TLR10; KCNK10; NEK7; BAZ1A, SRP54; SLC45A2, ADAMTS12, AMACR, RXFP3.

Interestingly, the other gene cluster detected (first row in Table 1), with four genes in chromosome 5, contains the well-known gene SLC45A2, described as being under positive selection in relation to skin pigmentation in Europeans (14). Other strong signals are for the BTNL2 gene locus in chromosome 6 coming from the TreeSelect test in Rroma and Romanian populations. This gene is highly polymorphic, with homology to the butyrophilin gene family, and is located at the border of the major histocompatibility complex (MHC) class II and class III regions in humans. This signal of positive selection may be due to the role of MHC in adaptation to pathogens in human history. Many other strong signals are shown in Fig. 2 A and B; however, these signals are specific to one single population or show differentiation between Rroma and Romanians and cannot be caused by a convergent adaptation of the same evolutionary process in these two populations.
Most of the signals found in this study cluster in regions of the genome with a high linkage disequilibrium (Fig. S3 A–C for TLR group, cluster containing SLC45A2 gene and cluster containing the BTNL2 gene). This finding makes it difficult to pinpoint the exact target of selection in each case, a general problem of selection studies (15). Clearly, genes in the TLR1/6/10 cluster are of special interest for the present study.

TLR2 Cluster Genes Are Involved in the Recognition of Yersinia pestis. TLR2 recognition of V-antigen and LcrV of Y. pestis is the main recognition mechanism during plague. TLR2 forms heterodimers with receptors of the same gene cluster (TLR1/TLR6) for recognition of bacterial lipopeptides (16), but it is not known whether TLR2 also collaborates with TLR10 for the recognition of Y. pestis. We transfected HEK cells (that normally express TLR1 and TLR6) with TLR2, TLR10, or TLR2 and TLR10. The HEK cells transfected with TLR2 alone release significantly more cytokines than untransfected cells: twofold more for Y. pestis and fivefold more for Yersinia pseudotuberculosis, the microorganism from which Y. pestis evolved (Fig. 3A). Although TLR10 by itself is not able to induce cytokine production, cotransfection of TLR10 with TLR2 completely abrogates the stimulatory effect of TLR2 (Fig. 3A). These data were supported by blocking TLR2 in monocytes using monoclonal antibodies (Fig. 3 B–D). Interestingly, blocking TLR10 resulted in an increase in cytokine production (Fig. 3 B–D), supporting the observation that TLR10 has a modulatory effect, thus corroborating the overexpression experiments. The modulatory effects of TLR10 seem to be exerted specifically on TLR2 signaling, as anti-TLR10 antibodies modulated cytokine production induced by palmitoyl-3-cysteine-serine-lysine-4, but not by the TLR4 agonist LPS (Fig. S4). Moreover, when cells of individuals carrying the SNP in TLR10 were exposed to either LPS, Poly I:C, CpG, or flagellin, no differences between the



Table 2. Statistical overrepresentation test of PANTHER analysis

Groups                                            No. genes in   No. genes in   Expected no. genes   P value
                                                  the database   the dataset    in the dataset
By biological process
  Cytokine-mediated signaling pathway             184            3              0.31                 0.0028
  Visual perception                               209            3              0.35                 0.0040
  Neurological system process                     830            5              1.38                 0.0062
  System process                                  920            5              1.53                 0.0097
  Sensory perception                              326            3              0.54                 0.0139
  Immune system process                           1,036          5              1.72                 0.0162
  Signal transduction                             1,642          6              2.73                 0.0266
  Cell communication                              1,730          6              2.87                 0.0344
  Cell surface receptor linked signal transduction 846           4              1.41                 0.0387
By molecular function
  Racemase and epimerase activity                 14             1              0.02                 0.0230
  Receptor activity                               779            4              1.29                 0.0294
  Transporter activity                            24             1              0.04                 0.0392

Biological process and molecular function enrichment for genes showing signals of selection in Rroma and Romanians.
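To make the columns of Table 2 easier to read, the following Python sketch shows one common way an overrepresentation test relates the expected count to a P value, using a one-sided binomial tail. It is a hedged illustration only: the reference-set size (20,000 genes) and the dataset size (30 genes) are hypothetical placeholders, the function names are invented for this example, and the exact procedure and inputs behind the published P values are not specified in the text, so the numbers in the table are not expected to be reproduced.

```python
from math import comb


def expected_count(n_category, n_reference, n_dataset):
    """Expected number of dataset genes falling in a category by chance."""
    return n_dataset * n_category / n_reference


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of seeing k or more hits."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical sizes, NOT taken from the paper.
N_REFERENCE = 20000   # assumed number of genes in the reference database
N_DATASET = 30        # assumed number of genes in the selected-window list

n_category = 184      # category size as listed in Table 2
observed = 3          # observed dataset genes in that category (Table 2)

p_category = n_category / N_REFERENCE
print(expected_count(n_category, N_REFERENCE, N_DATASET))
print(binomial_upper_tail(observed, N_DATASET, p_category))
```

A category is called overrepresented when the observed count is well above the expected count, so that the upper-tail probability is small.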

groups could be detected (Fig. S5). Interestingly, however, cross-linking of TLR10 receptors inhibited the IL-6 induction by IL-1 (Fig. S6), suggesting that TLR10 may exert inhibitory effects on the IL-1 family of cytokines (17).

Common TLR1, TLR6, and TLR10 Polymorphisms in European Populations Modulate Cytokine Responses to Y. pestis. To demonstrate that TLR1, TLR6, and TLR10 genetic variation in the population modulates the response to Y. pestis, we isolated peripheral blood mononuclear cells (PBMCs) from a group of 101 individuals of European descent and exposed them to the pathogen. SNPs in TLR1, TLR6, and TLR10 significantly influenced cytokine production induced by Y. pestis and Y. pseudotuberculosis (Fig. 4 and Fig. S7). In contrast, known polymorphisms in TLR4 (Asp299Gly and Thr399Ile) did not influence the response of PBMCs to Y. pestis or Y. pseudotuberculosis (Fig. S8).

Discussion
In this study, we identified a set of genes evolving under positive selection in populations of different ethnic ancestry living in Europe, but not in Northwest India. Among these genes, the region encompassing TLR1, TLR6, and TLR10 is under selection in Europeans/Romanians and Rroma/Gypsies, but not in a population from Northwest India. The common selection pressures in the Romanians and Rroma may be interpreted as the same evolutionary process induced by local infectious conditions in two European populations of different genetic backgrounds. To look for more evidence of positive selection in European populations, we analyzed sequence data from the 1000 Genome Project (18). These data show a clear selective sweep in Europeans using two methods based on genetic differentiation and extended linkage disequilibrium haplotype [cross-population extended haplotype homozygosity (XP-EHH) and XP-CLR]. This signal was specific

Fig. 3. The role of TLR10 for the recognition of Y. pestis and Y. pseudotuberculosis. (A) HEK293 transiently transfected with TLR2, TLR10, or TLR2/10, and stimulated with 1 × 10^5 heat-inactivated Y. pestis or Y. pseudotuberculosis, respectively. Bars represent the means ± SEM of at least three separate experiments. (B) PBMCs stimulated with Y. pestis or Y. pseudotuberculosis per mL. n = 6; means ± SEM; *P = 0.05, **P = 0.01. (C) TNF-α production after PBMCs stimulated with Y. pestis or Y. pseudotuberculosis in the presence or absence of 10 μg/mL antibody. (D) IL-1β production after 24 h of stimulation. Means ± SEM; *P = 0.05, **P = 0.01. The data shown are from three independent experiments each performed in duplicate.
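The figure legends report cytokine levels as means ± SEM. As a reminder of what that summary represents, here is a minimal Python sketch of the standard error of the mean (sample standard deviation divided by the square root of the number of replicates). The function name and the replicate values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements.

    SEM = sample standard deviation / sqrt(n); at least two values are needed.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two replicates to compute SEM")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical cytokine readings (pg/mL) from three separate experiments.
m, sem = mean_sem([100.0, 200.0, 300.0])
print(f"{m:.1f} ± {sem:.1f}")
```

Because SEM shrinks with the number of replicates, bars based on three experiments are expected to be wider than bars based on larger groups with similar spread.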



in Europeans and absent in an African population (Yoruba) and in a Chinese population (Fig. S9).

Fig. 4. Functional consequences of human TLR1/TLR6/TLR10 SNPs for Y. pestis–stimulated cytokine production. PBMCs from healthy volunteers stimulated with different stimuli, including Y. pestis (1 × 10^5/mL). Volunteers were separated into three groups: one group did not display the SNP in either TLR1 (A/B), TLR6 (C/D), or TLR10 (E/F; wt, wild-type); one group was heterozygous for the polymorphism (He); and one group was homozygous (Ho). Data are means ± SEM. *P = 0.05, **P = 0.01, ***P = 0.001.

Besides the TLR2 gene cluster, other genes of interest include (i) a gene cluster with four genes in chromosome 5 that contains the well-known gene SLC45A2 being under positive selection in relation to skin pigmentation; (ii) FBXL19, a gene known to be involved in the modulation of inflammation (19) in a cluster comprising three genes; and (iii) ADAMTS12 gene, which is associated with susceptibility to autoimmune diseases (20). In the same cluster as the SLC45A2 gene, other genes (Table 1) may be of special interest to be analyzed functionally in the future.
Linguistic and genetic studies suggested that the Rroma population left India in the 5–10th centuries and started to settle in Europe during the 11th century (21). Genetic studies, focused on uniparental and Mendelian disease markers, confirmed Rroma as an isolated population of Indian origin among the European majority (7). We posit that after the Rroma migration, the infectious pressures to which the Rroma were exposed were the same as for the Europeans, whereas for the ancestral North Indian population, they remain linked to their geographical location in India. This peculiar demographic situation in Europe, in which populations with different genetic backgrounds have been exposed for a long period to similar infection pressures, gave us the opportunity to attempt the reconstruction of recent evolutionary events acting on the immune system of populations living in Europe.
An important question is which evolutionary pressures were common to the Romanian and Rroma populations. Infections are likely to have been one of the most important evolutionary forces shaping the immune system in both Europe and India, and several candidates may be considered. An infection often associated with evolutionary effects in Europeans is plague, responsible for several large epidemics with death rates of up to 30–50% of the European population and lingering thereafter in Europe for several centuries (22), thus allowing for the exertion of selective sweeps. Based on this extreme burden of mortality, it is rational to hypothesize that plague had major evolutionary effects on the immune system of European populations. The TLR/IL-1 functional cluster is crucial for host defense against Y. pestis: TLR2 and its coreceptors TLR1, TLR6, and TLR10 are the main pattern recognition receptors for Y. pestis, all localized in a single gene cluster in chromosome 4 (23), whereas Y. pestis Caf1 protein is an inhibitor of IL-1β (24). Decreased IL-1 responses, either through defective TLR signaling or release of Caf1, are likely to have deleterious effects on host survival. The data presented here show that the TLR1/TLR6/TLR10 receptor cluster has been under positive selection in both Romanians and Rroma, and suggest that plague is a potential infection that has exerted this selection. Our data are also supported by an earlier study that identified the TLR1/TLR6/TLR10 gene cluster as a target of recent positive selection in non-Africans (25). We confirmed the functional impact of TLR1, TLR6, and TLR10 polymorphisms currently present in Europeans for the immune responses to Y. pestis.
Although evolutionary pressure exerted by plague is a plausible cause of adaptive selection, it should be emphasized that other infections in which the receptors of the TLR2 cluster play a central role, such as tuberculosis, leprosy, or common Gram-positive pathogens, could have also contributed to the genetic pattern observed here. Nevertheless, these infections have a generally less restricted geographical pattern, being as common in India as in Europe. Importantly, the impact of historical plagues in India has been a matter of debate. Out of the three main outbreaks of plague (6–7th centuries, 14th century, and turn of 19–20th century), by far the most devastating is the second, called the Black Death. This outbreak is known not to have affected India (26) and took place after the settlement of Rroma in Europe. Indeed, the Indian subcontinent may have been the only part of Eurasia to have experienced steady population growth during the last half of the 14th century, and the first reports of plague are from the 17th century, with much less impact than the Black Death. During the epidemics in the Indian subcontinent, the disease behaved differently from plague in the 14th century in Europe, with less than 5% human mortality. It is likely that the absence of the flea Xenopsylla cheopis due to tropical environment and the distance and geographical barriers could have prevented the entrance of the devastating outbreak of the Middle Ages into India (26).
The identification of the immune pathways and genetic variants that were specifically selected in Europe not only helps us to understand the evolutionary history of European populations, but also contributes to our understanding of the differences in susceptibility between European and other populations to modern human diseases. Evolutionary pressure exerted by plague or smallpox has been previously proposed to partly explain the increased resistance to HIV in Europeans (6). In addition, the evolution toward a proinflammatory profile induced by infections during history might explain the burden of autoimmune diseases in modern human populations (27). Genetic variation in TLR7 and TLR8 has been shown to protect against viral infections (25), while predisposing some to autoimmune diseases (4). Similarly, TLR1 or TLR10 polymorphisms can protect against infections, while being associated with autoinflammatory diseases such as sarcoidosis (28) and Crohn’s disease (29). Although the differences in cytokine production induced by Y. pestis in individuals with various TLR1, TLR6, or TLR10 polymorphisms are moderate from an immunological point of



view, they are large from an evolutionary perspective, and can lead in the long term to significant shifts in the population. It should be realized that we may not have detected other genes relevant for host defense that may be under selective pressure, as they have not been included in the Illumina immunochip array, and only future studies using genome-wide sequencing have the capacity to provide an exhaustive analysis of the entire genome.
In conclusion, by comparing genes under selection in European/Romanian and Rroma/Gipsy populations, we identified several immunological pathways specifically shaped by evolutionary processes in populations living together in Europe during the last millennium. It is likely that the selection pressure at least on some of these genes has been exerted by plague epidemics, and we identify the TLR1/TLR6/TLR10 pattern recognition system as a likely candidate.

Methods
Populations. After informed consent was obtained, blood was collected from 100 individuals of European/Romanian descent and 100 individuals of a Rroma/Gipsy ethnic background. A population of 500 individuals of North Indian descent, representing the geographic origin of the Rroma/Gipsy group, was also recruited. Healthy Dutch individuals were recruited for cytokine stimulations (21–73 y old, 73% males and 27% females).

Immunochip Arrays and Analysis of Genetic Distances Between Populations. Samples were genotyped on immunochip custom array at the Department of Genetics, University Medical Center Groningen, The Netherlands (8). To explore genetic relationships among the populations, we used PCA as implemented in the Eigensoft package (9). For a detailed description of the methods, see SI Methods.

Evolutionary Models. A selective sweep induces a fast spread of the beneficial allele through the population until it reaches fixation. Through hitchhiking, the selected allele carries with it neutral alleles in the linked genomic region. Thus, in comparison with the neutral expectation, one expects to observe within a region that has evolved recently under positive selection a dramatic pattern of genetic differentiation among populations within an extended genomic region. Taking advantage of these theoretical expectations, we applied two methodologies, XP-CLR (10) and TreeSelect (11) tests, to identify [...]
[...] as the best molecular pattern to study very recent events of positive selection after haplotype structure (30). However, the design of the immunochip with very variable SNP density across the genome does not allow us to properly study the haplotype structure (for phasing issues and haplotype informativeness differences among regions with different SNP density). We used tests that are amenable to SNP data (and thus with ascertainment bias). For an extensive description of the XP-CLR and TreeSelect tests, please consult SI Methods.

TLR Cloning and Transfection. TLR cloning and transfection of human embryonic kidney 293 cells that were stably transfected with hTLR2 (293-hTLR2; kindly provided by Dr. D. T. Golenbock, University of Massachusetts Medical Center, Worcester, MA) are described in detail in SI Methods.

Cytokine Stimulation. PBMCs were isolated after obtaining informed consent (31). PBMCs (5 × 10^5) in 100 μL volume were added to round-bottom 96-well plates (Greiner) and incubated with stimuli for 24 h at 37 °C and 5% CO2. Cytokines were measured using specific sandwich ELISA kits for IL-1β and TNF-α (R&D Systems). IL-6, IL-8, and IL-10 were measured using PeliKine Compact ELISA kits (Sanquin).

Immunogenetic Studies. DNA was isolated from whole blood using the Gentra Pure Gene Blood kit (Qiagen), and genotype assessments of the TLR10-N241H, TLR1-N248S, and TLR6-S249P SNPs were performed using a predesigned TaqMan SNP genotyping assay (Applied Biosystems). The software automatically plotted genotypes based on a two-parameter plot with an overall success rate of >95%. Cycling conditions were 2 min at 50 °C and 10 min at 95 °C, followed by 40 cycles of 95 °C for 15 s and 1 min at 60 °C. Fluorescence intensities were corrected using a postread/preread method for 1 min at 60 °C before and after the amplification.

ACKNOWLEDGMENTS. We thank Dr. Vandana Midha for recruitment of the Indian study cohort. We also thank the National Institute of Bioinformatics (www.inab.org) for computational support. M.G.N. and C.W. were supported by Vici grants of the Netherlands Organization of Scientific Research. This work was funded by Grant BFU2010-19443 (to J.B.) from the Ministerio de Ciencia y Tecnología (Spain) and the Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101). P.L. was supported by a PhD fellowship from “Acción Estratégica de Salud, en el Marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación
the genomic region under putative selection in European/Romanian and the Tecnológica 2008–2011” from Instituto de Salud Carlos III. B.K.T. was supported
Rroma/Gipsy populations, but not in the population from North India. We by Grant BT/01/COE/07/UDSC from the Department of Biotechnology, Government
focused our study on population differentiation because it has been described of India, New Delhi.

1. Barreiro LB, Quintana-Murci L (2010) From evolutionary genetics to human immunology: How selection shapes host defence genes. Nat Rev Genet 11(1):17–30.
2. Netea MG, Wijmenga C, O’Neill LA (2012) Genetic variation in Toll-like receptors and disease susceptibility. Nat Immunol 13(6):535–542.
3. Ferrer-Admetlla A, et al. (2008) Balancing selection is the main force shaping the evolution of innate immunity genes. J Immunol 181(2):1315–1322.
4. Stene LC, et al. (2006) Rotavirus infection frequency and risk of celiac disease autoimmunity in early childhood: A longitudinal study. Am J Gastroenterol 101(10):2333–2340.
5. Zhernakova A, et al.; Finnish Celiac Disease Study Group (2010) Evolutionary and functional analysis of celiac risk loci reveals SH2B3 as a protective factor against bacterial infection. Am J Hum Genet 86(6):970–977.
6. Stephens JC, et al. (1998) Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet 62(6):1507–1515.
7. Mendizabal I, et al. (2012) Reconstructing the population history of European Romani from genome-wide data. Curr Biol 22(24):2342–2349.
8. Trynka G, et al.; Spanish Consortium on the Genetics of Coeliac Disease (CEGEC); PreventCD Study Group; Wellcome Trust Case Control Consortium (WTCCC) (2011) Dense genotyping identifies and localizes multiple common and rare variant association signals in celiac disease. Nat Genet 43(12):1193–1201.
9. Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2(12):e190.
10. Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome Res 20(3):393–402.
11. Bhatia G, et al. (2011) Genome-wide comparison of African-ancestry populations from CARe and other cohorts reveals signals of natural selection. Am J Hum Genet 89(3):368–381.
12. Mi H, Muruganujan A, Thomas PD (2013) PANTHER in 2013: Modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res 41(Database issue):D377–D386.
13. Grundberg E, et al.; Multiple Tissue Human Expression Resource (MuTHER) Consortium (2012) Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat Genet 44(10):1084–1089.
14. Lao O, de Gruijter JM, van Duijn K, Navarro A, Kayser M (2007) Signatures of positive selection in genes associated with human skin pigmentation as revealed from analyses of single nucleotide polymorphisms. Ann Hum Genet 71(Pt 3):354–369.
15. Akey JM (2009) Constructing genomic maps of positive selection in humans: Where do we go from here? Genome Res 19(5):711–722.
16. Akira S, Uematsu S, Takeuchi O (2006) Pathogen recognition and innate immunity. Cell 124(4):783–801.
17. Mantovani A, Locati M, Polentarutti N, Vecchi A, Garlanda C (2004) Extracellular and intracellular decoys in the tuning of inflammatory cytokines and Toll-like receptors: The new entry TIR8/SIGIRR. J Leukoc Biol 75(5):738–742.
18. Abecasis GR, et al.; 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491(7422):56–65.
19. Zhao J, et al. (2012) F-box protein FBXL19-mediated ubiquitination and degradation of the receptor for IL-33 limits pulmonary inflammation. Nat Immunol 13(7):651–658.
20. Nah SS, et al. (2012) Association of ADAMTS12 polymorphisms with rheumatoid arthritis. Mol Med Rep 6(1):227–231.
21. Fraser A (1992) The Gypsies (Blackwell, Oxford).
22. McEvedy C (1988) The bubonic plague. Sci Am 258(2):118–123.
23. Takeuchi O, et al. (2002) Cutting edge: Role of Toll-like receptor 1 in mediating immune response to microbial lipoproteins. J Immunol 169(1):10–14.
24. Abramov VM, et al. (2001) Structural and functional similarity between Yersinia pestis capsular protein Caf1 and human interleukin-1 beta. Biochemistry 40(20):6076–6084.
25. Barreiro LB, et al. (2009) Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense. PLoS Genet 5(7):e1000562.
26. Sussman GD (2011) Was the black death in India and China? Bull Hist Med 85(3):319–355.
27. Di Rienzo A (2006) Population genetics models of common diseases. Curr Opin Genet Dev 16(6):630–636.
28. Veltkamp M, van Moorsel CH, Rijkers GT, Ruven HJ, Grutters JC (2012) Genetic variation in the Toll-like receptor gene cluster (TLR10-TLR1-TLR6) influences disease course in sarcoidosis. Tissue Antigens 79(1):25–32.
29. Abad C, et al. (2011) Association of Toll-like receptor 10 and susceptibility to Crohn’s disease independent of NOD2. Genes Immun 12(8):635–642.
30. Sabeti PC, et al.; International HapMap Consortium (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449(7164):913–918.
31. Oosting M, et al. (2011) TLR1/TLR2 heterodimers play an important role in the recognition of Borrelia spirochetes. PLoS ONE 6(10):e25998.

6 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1317723111 Laayouni et al.


Supporting Information
Laayouni et al. 10.1073/pnas.1317723111
SI Methods
Immunochip Arrays and Analysis of Genetic Distances Between Populations. Genotype calling was performed using the Genotyping Module (v1.8.4) of the GenomeStudio Data Analysis Software package; subsequently over 150,000 variants were manually inspected, and clusters were adjusted if needed. A cluster set based on 172,242 autosomal or X-chromosome variants was applied to all samples (1). National Center for Biotechnology Information assembly hg18 was used to map the genome (Illumina manifest file Immuno_BeadChip_11419691_B.bpm). Potential ethnic outliers were identified and excluded from analysis by multidimensional scaling plots of samples merged with HapMap3 data. Poor performing samples (call rate < 0.98), individuals with discordant sex, duplicates, and first- or second-degree relatives were removed.
After filtering out single-nucleotide polymorphisms (SNPs) not genotyped in more than 1% of the individuals, coding SNPs and SNPs on the sex chromosomes, on the one hand, plus high-kinship, outlier, and admixed individuals, on the other hand, and after removing those SNPs in linkage disequilibrium using Plink (2) 1.0724, we were left with 60,371 SNPs and 642 individuals. The statistical analysis showed that only the first two eigenvectors, which explain 12.6% and 5.5% of the total variance, respectively, were significant.

Evolutionary Models. The Cross-Population Composite Likelihood Ratio (XP-CLR) test [10] identifies selective sweeps in a population by detecting strong allele frequency differentiation. This method is robust to ascertainment bias (and thus may be used with SNP data) and does not require any haplotype information, thus avoiding errors of haplotype estimation from genotype data. XP-CLR scores were computed at regularly spaced grid points (every 5 kb) using the information from SNPs within a flanking window of 0.5 cM and with a minor allele frequency (MAF) above 0.01 in the set of individuals considered. To account for different SNP densities among genomic regions, we restricted to 200 the maximal number of SNPs used to calculate an XP-CLR score within a window, by randomly removing SNPs in excess. The Indian population was used as the reference population for the Rroma/Gypsies and the European Romanians, and the European Romanian population was used for the Indians. The TreeSelect [11] test is a method that aims at detecting unusual allele frequency differentiation among multiple populations using an unrooted tree that represents each population as a node. Thus, the tree describes the population divergence without knowing the order of divergence events in time. Using the reconstructed tree, the method contrasts whether any observed population is differentiated from the putative ancestral population. Thus, one can assess which populations have been under positive selection, which adds a valuable advantage over classical genetic differentiation indexes. Using TreeSelect software (www.hsph.harvard.edu/alkes-price/software/), we performed the TreeSelect test on each of the 95,224 SNPs with a MAF above 0.01 in the three populations together. Thus, we obtained for each SNP in each population a score following a χ²(1) distribution. From this score, we could obtain a formal P value to test the population at each particular SNP.
To screen the genome for positive selection in each population, we used a sliding window approach (window size, 50 kb; offset, 20 kb). The maximum score and the average within a window were used as the summary statistic for the XP-CLR and TreeSelect statistics, respectively. Then, using XP-CLR scores to identify signals of selection shared between Europeans/Romanians and Rromas/Gypsies but not present in the Indian population, we identified the windows belonging to the 1% upper tail of the genome-wide distribution in both the European/Romanian versus Indian and Rroma/Gypsy versus Indian comparisons but not in the Rroma/Gypsy versus European/Romanian comparison. With the same purpose, we identified windows belonging to the 1% upper tail of the TreeSelect test genome-wide distribution in both European/Romanian and Rroma/Gypsy populations but not in the Indian one.

TLR Cloning and Transfection. Cells were cultured in 293–Toll-like receptor 2 (TLR2) medium: DMEM (Invitrogen) supplemented with 7.5% FBS (HyClone), 100 U/mL Penicillin–100 μg Streptomycin (Invitrogen), and 1 mg/mL G418 (Sigma Chemical Co.). After overnight incubation at 37 °C on Fast-Media Blast agar plates (Invivogen), a single colony of GT110 Escherichia coli containing the pUNO-hTLR10 vector (Invivogen) was transferred to 3 mL Fast-Media Blast–TB medium (Invivogen) and incubated overnight at 37 °C, and 225 L/min QIAGEN Plasmid Mini isolation was performed according to the manufacturer’s instructions (Qiagen Germany). DNA was stored at –20 °C until further use. The vector was checked by sequence analysis.
For transient transfections, 293-hTLR2 cells were cultured overnight in six-well plates (1 × 10⁶ cells per well) and transfected with the hTLR10 encoding plasmid (bearing a blasticidin resistance gene) using Fugene6 transfection reagent according to the manufacturer’s protocol (Roche). Briefly, 3 μL Fugene6 transfection reagent was added to 97 μL DMEM (Gibco), incubated for 5 min at room temperature, and followed by the addition of 1 μg of TLR10 plasmid DNA and incubation of 15 min at room temperature. Subsequently, the complexes were added drop-wise to the 80% confluent monolayers of 293-hTLR2 cells. After incubation for 24 h at 37 °C and 95% humidity, cells were cultured for 8 wk in 293-TLR2 medium supplemented with 5 μg/mL blasticidin by passaging the cells at 80% confluence before use in the experiments. hTLR10 expression was confirmed using FACS analysis.
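
As an illustration of the window-based screen described under Evolutionary Models in SI Methods, the following Python sketch converts a χ²(1) score into a P value, summarises per-site scores in 50-kb windows with a 20-kb offset (maximum for XP-CLR, mean for TreeSelect), and keeps windows in the upper tail of both comparisons against the Indian population but not of the Rroma versus Romanian comparison. It is a minimal sketch and not the authors' pipeline: all function names are hypothetical, the alignment of windows to multiples of the offset is an assumption, and the upper tail is taken by simple rank.

```python
import math
import statistics
from collections import defaultdict

WINDOW = 50_000  # window size in bp (50 kb, as stated in the text)
OFFSET = 20_000  # distance between consecutive window starts (20 kb)


def chi2_1df_pvalue(score):
    """Upper-tail P value for a score that follows a chi-square(1) distribution."""
    # For 1 degree of freedom, P(X > s) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(score / 2.0))


def window_summaries(positions, scores, summary):
    """Summarise per-site scores in overlapping windows on one chromosome.

    Assumption: windows start at multiples of OFFSET and cover
    [start, start + WINDOW). Empty windows are skipped.
    """
    buckets = defaultdict(list)
    for pos, score in zip(positions, scores):
        first = max(0, (pos - WINDOW) // OFFSET + 1)  # first window containing pos
        last = pos // OFFSET                          # last window containing pos
        for k in range(first, last + 1):
            buckets[k * OFFSET].append(score)
    return {start: summary(vals) for start, vals in sorted(buckets.items())}


def upper_tail(window_scores, fraction=0.01):
    """Windows whose summary ranks in the top `fraction` (at least one window)."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    n_top = max(1, int(fraction * len(ranked)))
    return set(ranked[:n_top])


def shared_candidates(eur_vs_ind, roma_vs_ind, roma_vs_eur, fraction=0.01):
    """Windows extreme in both comparisons against India but not in Rroma vs. Romanians."""
    shared = upper_tail(eur_vs_ind, fraction) & upper_tail(roma_vs_ind, fraction)
    return shared - upper_tail(roma_vs_eur, fraction)


# Toy data (invented for illustration): three sites on one chromosome.
pos = [10_000, 45_000, 50_000]
xpclr_windows = window_summaries(pos, [1.0, 2.0, 7.0], max)                   # XP-CLR: window maximum
treeselect_windows = window_summaries(pos, [1.0, 2.0, 7.0], statistics.mean)  # TreeSelect: window mean
```

In a genome-wide setting the window keys would need to carry the chromosome as well (for example, a `(chromosome, start)` tuple) so that the upper tail is taken over the whole genome-wide distribution rather than per chromosome.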

1. Trynka G, et al.; Spanish Consortium on the Genetics of Coeliac Disease (CEGEC); PreventCD Study Group; Wellcome Trust Case Control Consortium (WTCCC) (2011) Dense genotyping identifies and localizes multiple common and rare variant association signals in celiac disease. Nat Genet 43(12):1193–1201.
2. Oosting M, et al. (2011) TLR1/TLR2 heterodimers play an important role in the recognition of Borrelia spirochetes. PLoS ONE 6(10):e25998.

Laayouni et al. www.pnas.org/cgi/content/short/1317723111 1 of 8


Fig. S1. Manhattan plot of results of FST tests in Rroma vs. Indians, Romanians vs. Indians, and Rroma vs. Romanians. Chromosomes are ordered from chromosome
1 to chromosome 22. We highlight peaks in TLR1, TLR6, and TLR10 (chromosome 4). Note the strong overlap between the FST analysis and the TreeSelect statistic.

Fig. S2. CLR statistic for 400 kb upstream and 400 kb downstream of the TLR2 gene cluster. SNPs were genotyped with Illumina Omni 2.5 M Chip. Scores
higher than 10 are indicative of positive selection. No signal of selection is observed in the TLR2 gene cluster for the Indian population (Gujarati) compared
with the CEU (Northern Europeans from Utah) population (A); however, a clear signal of selection is detected in the European population (B).



Fig. S3. HapMap linkage disequilibrium of a region containing TLR1, TLR6, TLR10, and FAM114A1 on chromosome 4 (A); cluster containing SLC45A2, ADAMTS12,
AMACR, and RXFP3 genes on chromosome 5 (B); and cluster containing genes BTNL2 and HLA on chromosome 6 (C). The linkage disequilibrium map is obtained
from HapMap phase III from the CEU European population.



Fig. S4. 5 × 10⁵ peripheral blood mononuclear cells (PBMCs) from 15 individuals were preincubated for 1 h at 37 °C with either 10 μg/mL aTLR10 antibody
(aTLR10, clone 3C10C5) or 10 μg/mL IgG isotype control (IgG). After preincubation, cells were stimulated for 24 h with live Borrelia burgdorferi (B.b), TLR2
ligand palmitoyl-3-cysteine-serine-lysine-4 (10 μg/mL), or TLR4 ligand LPS (10 ng/mL). After stimulation, cell supernatants were collected and proinflammatory
cytokine IL-6 was measured using ELISA. Bars represent the mean and the SEM. Paired t test; *P < 0.05, **P < 0.01.

Fig. S5. 5 × 10⁵ PBMCs were stimulated with either RPMI, 10 ng/mL LPS, 50 μg/mL Poly I:C, 5 μg/mL CpG, or 100 ng/mL flagellin for 24 h. After stimulation, IL-6
production was measured in the supernatants using ELISA. Wt, wild type for the N241H SNP in the TLR10 gene; He, heterozygous mutation; Ho, homozygous
mutation. Bars represent the mean ± SEM.



Fig. S6. TLR10 may exert inhibitory effects on the IL-1 family of cytokines. We coated 10 μg/mL anti-TLR10 antibody or 10 μg/mL IgG isotype control in 96-well
flat-bottom plates and incubated them for 2 h at 37 °C. After washing and blocking, 5 × 10⁵ PBMCs from 10 donors were added and incubated 1 h at 37 °C.
Thereafter, cells were stimulated at 37 °C for 24 h with either RPMI as negative control or increasing concentrations of IL-1β (1 ng/mL). IL-6 production was
measured in the supernatant using ELISA. Bars represent the mean ± SEM.



Fig. S7. Functional consequences of human TLR1/TLR6/TLR10 SNPs for Yersinia pseudotuberculosis–induced cytokine production. PBMCs from healthy volunteers
were stimulated for 24 h with different stimuli, including Y. pseudotuberculosis (1 × 10⁵/mL). After stimulation, supernatants were collected, and
cytokine levels were measured by ELISA. The TLR1, TLR6, and TLR10 status of these individuals was determined before PBMC stimulation, and they were
separated into three groups: one group did not display the SNP in either TLR1 (A/B), TLR6 (C/D), or TLR10 (E/F; wt, wild type), one group was heterozygous for
the polymorphism (He), and one group was homozygous (Ho). Data are means ± SEM. *P = 0.05, **P = 0.01.



Fig. S8. Functional consequences of human TLR4 SNPs for Yersinia pestis and Y. pseudotuberculosis–induced cytokine production. PBMCs from healthy
volunteers were stimulated for 24 h with either Y. pestis or Y. pseudotuberculosis (1 × 10⁵/mL). After stimulation, supernatants were collected, and cytokine
levels were measured by ELISA. The TLR4 status of these individuals was determined before PBMC stimulation, and they were separated into two groups: one
group did not display the SNP in TLR4 (wt, wild type), and one group was heterozygous for the polymorphism (He). Data are means ± SEM. *P = 0.05,
Mann–Whitney U test. IL-1β, interleukin 1β; TNF-α, tumor necrosis factor-α; IL-10, interleukin 10; IL-6, interleukin 6; IL-8, interleukin 8.



Fig. S9. Tracks of two tests of selection, XP-CLR and cross-population extended haplotype homozygosity (XP-EHH) (based on linkage disequilibrium), in three
populations (Europeans, CEU; Asians from China, CHB; and Africans, Yoruba YRI) from the 1000 Genomes Project in a region of ∼1 Mb surrounding the TLR2 gene
cluster. It shows a strong signal in CEU, not seen in Africans, and minor in Asians. It is interesting to note the strong peak in the region around 38.5 Mb, where
a long noncoding RNA of unknown function is annotated (RP11-83C7.2).

