
Erratum 14 December 2017. See erratum.

RESEARCH

RESEARCH ARTICLE SUMMARY

HUMAN EVOLUTION

Loci associated with skin pigmentation identified in African populations

Nicholas G. Crawford, Derek E. Kelly,* Matthew E. B. Hansen,* Marcia H. Beltrame,* Shaohua Fan,* Shanna L. Bowman,* Ethan Jewett,* Alessia Ranciaro, Simon Thompson, Yancy Lo, Susanne P. Pfeifer, Jeffrey D. Jensen, Michael C. Campbell, William Beggs, Farhad Hormozdiari, Sununguko Wata Mpoloka, Gaonyadiwe George Mokone, Thomas Nyambo, Dawit Wolde Meskel, Gurja Belay, Jake Haut, NISC Comparative Sequencing Program, Harriet Rothschild, Leonard Zon, Yi Zhou, Michael A. Kovacs, Mai Xu, Tongwu Zhang, Kevin Bishop, Jason Sinclair, Cecilia Rivas, Eugene Elliot, Jiyeon Choi, Shengchao A. Li, Belynda Hicks, Shawn Burgess, Christian Abnet, Dawn E. Watkins-Chow, Elena Oceana, Yun S. Song, Eleazar Eskin, Kevin M. Brown, Michael S. Marks,† Stacie K. Loftus,† William J. Pavan,† Meredith Yeager,† Stephen Chanock,† Sarah A. Tishkoff‡

ON OUR WEBSITE
Read the full article at http://dx.doi.org/10.1126/science.aan8433

Downloaded from http://science.sciencemag.org/ on August 18, 2019

INTRODUCTION: Variation in pigmentation among human populations may reflect local adaptation to regional light environments, because dark skin is more photoprotective, whereas pale skin aids the production of vitamin D. Although genes associated with skin pigmentation have been identified in European populations, little is known about the genetic basis of skin pigmentation in Africans.

RATIONALE: Genetically and phenotypically diverse African populations are informative for mapping genetic variants associated with skin pigmentation. Analysis of the genetics of skin pigmentation in Africans informs upon melanocyte biology and the evolution of skin pigmentation in humans.

RESULTS: We observe extensive variation in skin pigmentation in Africa, with lowest melanin levels observed in southern African San hunter-gatherers and highest levels in East African Nilo-Saharan pastoralists. A genome-wide association study (GWAS) of 1570 Africans identified variants significantly associated with skin pigmentation, which clustered in four genomic regions that together account for almost 30% of the phenotypic variation.

The most significantly associated single-nucleotide polymorphisms were at SLC24A5, a gene associated with pigmentation in Europeans. We show that SLC24A5 was introduced into East Africa >5 thousand years ago (ka) and has risen to high frequency.

The second most significantly associated region is near the gene MFSD12. Using in vitro and in vivo analyses, we show that MFSD12 codes for a lysosomal protein that modifies pigmentation in human melanocytes, with decreased MFSD12 expression associated with darker pigmentation. We also show that genetic knockouts of MFSD12 orthologs affect pigmentation in both zebrafish and mice.

A third highly associated region encompasses a cluster of genes that play a role in ultraviolet (UV) response and DNA damage repair. We find the strongest associations in a regulatory region upstream of DDB1, the gene encoding damage-specific DNA binding protein 1, and that these variants are associated with increased expression of DDB1. The alleles associated with light pigmentation swept to near fixation outside of Africa due to positive selection, and we show that these lineages coalesce ~60 ka, corresponding with the time of migration of modern humans out of Africa.

The fourth significantly associated region encompasses the OCA2 and HERC2 loci. We identify previously uncharacterized variants at HERC2 associated with the expression of OCA2. These variants arose independently from eye and skin pigmentation–associated variants in non-Africans. We also identify variants at OCA2 that are correlated with alternative splicing; alleles associated with light pigmentation are correlated with a shorter transcript, which lacks a transmembrane domain.

CONCLUSION: We identify previously uncharacterized genes and variants associated with skin pigmentation in ethnically diverse Africans. These genes have diverse functions, from repairing UV damage to playing important roles in melanocyte biology. We show that both dark and light pigmentation alleles arose before the origin of modern humans and that both light and dark pigmented skin has continued to evolve throughout hominid history. We show that variants associated with dark pigmentation in Africans are identical by descent in South Asian and Australo-Melanesian populations. This study sheds light on the evolutionary history, and adaptive significance, of skin pigmentation in humans.

GWAS and functional assays illuminate the genetic basis of pigmentation in Africa. A GWAS identified four genomic regions associated with skin pigmentation in Africa. Functional assays in melanocytes, zebrafish, and mice characterized their impact on skin pigmentation. Evolutionary genetic analyses revealed that most derived variants evolved before the origin of modern humans. Ma, million years ago.

The list of author affiliations is available in the full article online.
*These authors contributed equally to this work.
†These authors contributed equally to this work.
‡Corresponding author. Email: tishkoff@pennmedicine.upenn.edu
Cite this article as N. G. Crawford et al., Science 358, eaan8433 (2017). DOI: 10.1126/science.aan8433

Crawford et al., Science 358, 887 (2017) 17 November 2017 1 of 1


RESEARCH ARTICLE

HUMAN EVOLUTION

Loci associated with skin pigmentation identified in African populations

Nicholas G. Crawford,1 Derek E. Kelly,1,2* Matthew E. B. Hansen,1* Marcia H. Beltrame,1* Shaohua Fan,1* Shanna L. Bowman,3,4* Ethan Jewett,5,6* Alessia Ranciaro,1 Simon Thompson,1 Yancy Lo,1 Susanne P. Pfeifer,7 Jeffrey D. Jensen,7 Michael C. Campbell,1,8 William Beggs,1 Farhad Hormozdiari,9,10 Sununguko Wata Mpoloka,11 Gaonyadiwe George Mokone,12 Thomas Nyambo,13 Dawit Wolde Meskel,14 Gurja Belay,14 Jake Haut,1 NISC Comparative Sequencing Program,† Harriet Rothschild,15 Leonard Zon,15,16 Yi Zhou,15,17 Michael A. Kovacs,18 Mai Xu,18 Tongwu Zhang,18 Kevin Bishop,19 Jason Sinclair,19 Cecilia Rivas,20 Eugene Elliot,20 Jiyeon Choi,18 Shengchao A. Li,21,22 Belynda Hicks,21,22 Shawn Burgess,19 Christian Abnet,21 Dawn E. Watkins-Chow,20 Elena Oceana,23 Yun S. Song,5,6,24,25,26 Eleazar Eskin,27 Kevin M. Brown,18 Michael S. Marks,3,4‡ Stacie K. Loftus,20‡ William J. Pavan,20‡ Meredith Yeager,21,22‡ Stephen Chanock,21‡ Sarah A. Tishkoff1,25§

Despite the wide range of skin pigmentation in humans, little is known about its genetic basis in global populations. Examining ethnically diverse African genomes, we identify variants in or near SLC24A5, MFSD12, DDB1, TMEM138, OCA2, and HERC2 that are significantly associated with skin pigmentation. Genetic evidence indicates that the light pigmentation variant at SLC24A5 was introduced into East Africa by gene flow from non-Africans. At all other loci, variants associated with dark pigmentation in Africans are identical by descent in South Asian and Australo-Melanesian populations. Functional analyses indicate that MFSD12 encodes a lysosomal protein that affects melanogenesis in zebrafish and mice, and that mutations in melanocyte-specific regulatory regions near DDB1/TMEM138 correlate with expression of ultraviolet response genes under selection in Eurasians.

Variation in epidermal pigmentation is a striking feature of modern humans. Human pigmentation is correlated with geographic and environmental variation (Fig. 1). Populations at lower latitudes have darker pigmentation than those at higher latitudes, suggesting that skin pigmentation is an adaptation to differing levels of ultraviolet radiation (UVR) (1). Because equatorial regions receive more UVR than temperate regions, populations from these regions (including sub-Saharan Africans, South Asians, and Australo-Melanesians) have darker pigmentation (Fig. 1), which likely mitigates the negative impact of high UVR exposure, such as skin cancer and folate degradation (1). In contrast, the synthesis of vitamin D3 in response to UVR, needed to prevent rickets, may drive selection for light pigmentation at high latitudes (1).

The basal layer of human skin contains melanocytes, specialized pigment cells that harbor subcellular organelles called melanosomes, in which melanin pigments are synthesized and stored and then transferred to keratinocytes (2). Melanosome morphology and content differ between melanocytes that synthesize mainly eumelanins (black-brown pigments) or pheomelanins (pigments that range from yellow to reddish brown) (3). Variation in skin pigmentation is due to the type and quantity of melanins generated, melanosome size, and the manner in which keratinocytes sequester and degrade melanins (4).

Although more than 350 pigmentation genes have been identified in animal models, only a subset of these genes have been linked to normal variation in humans (5). Of these, there is limited knowledge about loci that affect pigmentation in populations with African ancestry (6, 7).

Skin pigmentation is highly variable within Africa

To identify genes affecting skin pigmentation in Africa, we used a DSM II ColorMeter to quantify light reflectance from the inner arm as a proxy for melanin levels in 2101 ethnically and genetically diverse Africans living in Ethiopia, Tanzania, and Botswana (table S1 and figs. S1 and S2) (8). Skin pigmentation levels vary extensively among Africans, with darkest pigmentation observed in Nilo-Saharan–speaking pastoralist populations in eastern Africa and lightest pigmentation observed in San hunter-gatherer populations from southern Africa (Fig. 2 and table S1).

A locus associated with light skin color in Europeans is common in East Africa

We genotyped 1570 African individuals with quantified pigmentation levels using the Illumina Infinium Omni5 Genotyping array. After quality control, we retained ~4.2 million biallelic single-nucleotide polymorphisms (SNPs) for analysis. A genome-wide association study (GWAS) analysis with linear mixed models, controlling for age, sex, and genetic relatedness (9), identified four regions with multiple significant associations (P < 5 × 10^−8) (Fig. 1, fig. S3, and tables S2 and S3).

We then performed fine-mapping using local imputation of high-coverage sequencing data from a subset of 135 individuals and data from the Thousand Genomes Project (TGP) (Fig. 3 and table S3) (10). We ranked potential causal variants

1Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. 2Genomics and Computational Biology Graduate Program, University of
Pennsylvania, Philadelphia, PA 19104, USA. 3Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia Research Institute, Philadelphia, PA 19104, USA. 4Department
of Pathology and Laboratory Medicine and Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. 5Department of Electrical
Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA 94704, USA. 6Department of Statistics, University of California, Berkeley, Berkeley, CA 94704, USA. 7School
of Life Sciences, Arizona State University, Tempe, AZ 85287, USA. 8Department of Biology, Howard University, Washington, DC 20059, USA. 9Department of Epidemiology, Harvard T.H. Chan
School of Public Health, Boston, MA 02115, USA. 10Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA
02142, USA. 11Department of Biological Sciences, University of Botswana, Gaborone, Botswana. 12Department of Biomedical Sciences, University of Botswana School of Medicine, Gaborone,
Botswana. 13Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania. 14Department of Biology, Addis Ababa University, Addis Ababa, Ethiopia.
15Stem Cell Program, Division of Hematology and Oncology, Pediatric Hematology Program, Boston Children’s Hospital and Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA
02115, USA. 16Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA. 17Harvard Stem Cell Institute, Harvard University, Cambridge, MA 02138, USA. 18Laboratory of
Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA. 19Translational and Functional
Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA. 20Genetic Disease Research Branch, National Human Genome Research
Institute, National Institutes of Health, Bethesda, MD 20892, USA. 21Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD 20892,
USA. 22Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., Frederick, MD 21701, USA. 23Department of Molecular Pharmacology, Physiology and Biotechnology,
Brown University, Providence, RI 02912, USA. 24Chan Zuckerberg Biohub, San Francisco, CA 94158, USA. 25Department of Biology, School of Arts and Sciences, University of Pennsylvania,
Philadelphia, PA 19104, USA. 26Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104, USA. 27Department of Computer Science and Department of Human Genetics,
University of California, Los Angeles, Los Angeles, CA 90095, USA.
*These authors contributed equally to this work. †National Institutes of Health Intramural Sequencing Center (NISC) Comparative Sequencing Program collaborators and affiliations are listed in the supplementary
materials. ‡These authors contributed equally to this work. §Corresponding author. Email: tishkoff@pennmedicine.upenn.edu

Crawford et al., Science 358, eaan8433 (2017) 17 November 2017 1 of 14


within each locus using CAVIAR, a fine-mapping method that accounts for linkage disequilibrium (LD) and effect sizes (Table 1) (11). We characterized global patterns of variation at these loci using whole-genome sequences from West African, Eurasian, and Australo-Melanesian populations (10, 12, 13).

The SNPs with strongest association with skin color in Africans were on chromosome 15 at or near the solute carrier family 24 member 5 (SLC24A5) gene (Figs. 1 and 3 and tables S2 and S3). A functional nonsynonymous mutation within SLC24A5 (rs1426654) (14) was significantly associated with skin color (F test, P = 5.5 × 10^−62) and was identified as potentially causal by CAVIAR (Table 1). The rs1426654 (A) allele is at high frequency in European, Pakistani, and Indian populations (Fig. 1) and is a target of selection in Europeans, Central Asians, and North Indians (15–18). In Africa, this variant is common (28 to 50% frequency) in populations from Ethiopia and Tanzania with high Afro-Asiatic ancestry (19, 20) and is at moderate frequency (5 to 11%) in San and Bantu-speaking populations from Botswana with low levels of East African ancestry and recent European admixture (Fig. 1 and figs. S2 and S4) (21, 22). We observe a signature consistent with positive selection at SLC24A5 in Europeans based on extreme values of Tajima’s D statistic (fig. S5).

On the basis of coalescent analysis with sequence data from the Simons Genomic Diversity Project (SGDP) (13), the time to most recent common ancestor (TMRCA) of most Eurasian lineages containing the rs1426654 (A) allele is 29 thousand years ago (ka) [95% critical interval (CI), 28 to 31 ka], consistent with previous studies (15, 17) (Fig. 4). Haplotype analysis indicates that the rs1426654 (A) variant in Africans is on the same extended haplotype background as Europeans (Fig. 5 and fig. S6), likely reflecting gene flow from western Eurasia over at least the past 3 to 9 ky (23). The rs1426654 (A) variant is at high frequency (28%) in Tanzanian populations, suggesting a lower bound (~5 ka) for introduction of this allele into East Africa, the time of earliest migration from Ethiopia into Tanzania (24). Furthermore, the frequency of the rs1426654 (A) variant in eastern and southern Africans exceeds the inferred proportion of non-African ancestry (figs. S2 and S4). Estimates of genetic differentiation (FST) at the rs1426654 SNP between the West African Yoruba (YRI) and Ethiopian Amhara populations is 0.76, among the top 0.01% of values on chromosome 15 (table S4). These results are consistent with selection for the rs1426654 (A) allele in African populations following introduction, although complex models of demographic history cannot be ruled out.

A lysosomal transporter protein associated with skin pigmentation

The region with the second strongest genetic association with skin pigmentation contains the major facilitator superfamily domain containing 12 (MFSD12) gene on chromosome 19 (Figs. 1 and 3 and tables S2 and S3). MFSD12 is homologous to other genes containing MFS domains, conserved throughout vertebrates, which function as transmembrane solute transporters (25). MFSD12 mRNA levels are low in depigmented skin of vitiligo patients (26), likely due to autoimmune-related destruction of melanocytes.

The MFSD12 locus is in a region with extensive recombination, enabling us to fine-map eight potentially causal SNPs (Table 1 and table S3) that cluster in two regions: one within MFSD12 and the other ~7600 to 9000 base pairs (bp) upstream of MFSD12 (Fig. 3). Many SNPs are in predicted regulatory regions active in melanocytes and/or keratinocytes (Table 1 and Fig. 3) and show enhancer activity in luciferase expression assays in a WM88 melanoma cell line (Table 1, table S5, and fig. S7). Within MFSD12, the two SNPs that CAVIAR identifies as having the highest probability of being causal are rs56203814 (F test, P = 3.6 × 10^−18), a synonymous variant within exon 9, and rs10424065 (F test, P = 5.1 × 10^−20), located within intron 8. They are 130 bp apart, are in strong LD, and affect gene expression in luciferase expression assays (1.5 to 2.7× higher

[Fig. 1 image, panels A to L. Axis and scale labels: Melanin index (20 to 120); Erythemal dose rate (mW/m^2) (20 to 380); −log10(P-value) versus Chromosomes (1 to 22), with peaks labeled SLC24A5, MFSD12, DDB1, and HERC2/OCA2; QQ plot. Panel titles: SLC24A5 - rs1426654 (15:48426484); MFSD12 - rs10424065 (19:3545022); MFSD12 - rs6510760 (19:3565253); TMEM138 - rs7948623 (11:61137147); DDB1 - rs11230664 (11:61076372); OCA2 - rs1800404 (15:28235773); HERC2 - rs4932620 (15:28514281); HERC2 - rs6497271 (15:28365431).]

Fig. 1. Correlations between allele frequencies at loci associated with pigmentation and UV exposure in global populations. (A) Global variation in skin pigmentation indicated by melanin index (MI). These data were integrated with MI data for global populations from (1, 105). (B) Mean erythemal dose rate. (C) Manhattan plot of −log10 transformed P values from GWAS of skin pigmentation with the Illumina Omni5 SNP array. (D) Quantile-quantile (QQ) plot of observed versus expected P values from the GWAS. In both (C) and (D), significant SNPs at P < 5 × 10^−8 are highlighted in purple. (E to L) Allele frequencies of genetic variants associated with skin pigmentation in global populations. African populations included are Ethiopia Nilo-Saharan (1), Ethiopia Omotic (2), Ethiopia Semitic (3), Ethiopia and Tanzania Cushitic (4), Tanzania Nilo-Saharan (5), Tanzania Hadza (6), Tanzania Sandawe (7), Botswana Bantu (8), Botswana San/Bantu admixed (9), and Botswana San (10). The Melanesian (MEL) samples are from (12), and the Australian Aboriginal and Papua New Guinean samples (merged) are from the SGDP (PNG) (13). All other populations are from the TGP (10). Non-Aboriginal populations in the Americas are indicated: CEU (European ancestry), ASW (African-American Southwest U.S.), and ACB (African Caribbean in Barbados).



expression than the minimal promoter; fig. S7). The SNPs upstream of MFSD12 with highest probability of being causal are rs112332856 (F test, P = 3.8 × 10^−16) and rs6510760 (F test, P = 6.5 × 10^−15). They are 346 bp apart, are in strong LD, and affect gene expression in luciferase expression assays (4.0 to 19.7× higher expression than the minimal promoter; fig. S7).

[Fig. 2 image: one histogram per population, x axis Melanin index (50 to 125). Panels: Botswana San, N = 358; Botswana San/Bantu, N = 106; Ethiopia Semitic, N = 193; Botswana Bantu, N = 292; Tanzania Sandawe, N = 98; Ethiopia and Tanzania Cushitic, N = 292; Ethiopia Omotic, N = 155; Tanzania Nilo-Saharan, N = 103; Tanzania Hadza, N = 379; Ethiopia Nilo-Saharan, N = 112.]

Fig. 2. Melanin distributions. Histograms of melanin index computed from underarm measurements with a DSM II ColorMeter for all individuals in each population as described in (70). Skin tones were visualized by displaying the scaled mean red, green, and blue values from the ColorMeter for individuals binned by melanin index.

The derived rs56203814 (T) and rs10424065 (T) alleles associated with dark pigmentation are present only in African populations (or those of recent African descent) and are most common in East African populations with Nilo-Saharan ancestry (Fig. 1 and fig. S4). Coalescent analysis of the SGDP data set indicates that the rs10424065 (T) allele predates the 300-ka origin of modern humans (estimated TMRCA of 612 ka; 95% CI, 515 to 736 ka) (Fig. 4) (27).

At rs6510760 and rs112332856, the ancestral (G) and (T) alleles, respectively, associated with light pigmentation, are nearly fixed in Europeans and East Asians and are common in San as well as Ethiopian and Tanzanian populations with Afro-Asiatic ancestry (Fig. 1 and fig. S4). The derived rs6510760 (A) and rs112332856 (C) alleles (associated with dark pigmentation) are common in all sub-Saharan Africans except the San, as well as in South Asian and Australo-Melanesian populations (Fig. 1 and fig. S4). Haplotype analysis places the rs6510760 (A) allele [and linked rs112332856 (C) allele] in Australo-Melanesians on similar haplotype backgrounds relative to central and eastern Africans (Fig. 5 and fig. S6), suggesting that they are identical by descent from an ancestral African population. Coalescent analysis of the SGDP data set indicates that the TMRCA for the derived rs6510760 (A) allele is 996 ka [95% CI, 0.82 to 1.2 million years ago (Ma); Fig. 4].

We do not detect evidence for positive selection at MFSD12 using Tajima’s D and iHS statistics [figs. S5 and S8; as expected if selection were ancient (28)]. However, levels of genetic differentiation are elevated when comparing East African Nilo-Saharan and western European (CEU) populations (for example, FST = 0.85 for rs112332856, top 0.05% on chromosome 19), consistent with differential selection at this locus (table S4) (29).

MFSD12 is within a cluster of 10 genes with high expression levels in primary human melanocytes relative to primary human keratinocytes (30), with MFSD12 as the most differentially expressed (90×; table S6). The genomic region (chr19:3541782-3581062) encompassing MFSD12 and neighboring gene HMG20B (a transcription factor common in melanocytes) has numerous deoxyribonuclease (DNase) I hypersensitive sites (DHS) and is enriched for H3K27ac enhancer marks in melanocytes (top 0.1% genome-wide; Fig. 3), suggesting that this region may regulate expression of genes critical to melanocyte function (31).

Analyses of gene expression using RNA sequencing (RNA-seq) data from 106 primary melanocyte cultures (table S7) indicate that African ancestry is significantly correlated with decreased MFSD12 gene expression [Pearson correlation coefficient (PCC), P = 5.0 × 10^−2; fig. S9]. We observed significant associations between genotypes at rs6510760 and rs112332856 with expression of HMG20B [Bonferroni-adjusted P (Padj) < 4.9 × 10^−3] and MFSD12 (Padj < 3.4 × 10^−2) (fig. S9). In each case, the alleles associated with dark pigmentation correlate with decreased gene expression. Allele-specific expression (ASE) analysis indicates that individuals heterozygous for either rs6510760 or rs112332856 show increased allelic imbalance, relative to homozygotes, for MFSD12 (Mann-Whitney-Wilcoxon test, P = 4.9 × 10^−3 and 1.3 × 10^−2, respectively), consistent with regulation of gene expression in cis. A haplotype containing the rs6510760 (A)/rs112332856 (C) variants associated with dark pigmentation showed 4.9 times lower expression in luciferase assays than the haplotype containing rs6510760 (G)/rs112332856 (T) variants associated with light pigmentation (Kruskal-Wallis rank-sum test, P = 7.7 × 10^−7; fig. S7 and table S5). We did not have power to detect an association between expression of MFSD12 and rs56203814 or rs10424065 due to low frequency (~2%) of the alleles associated with dark pigmentation in the primary melanocyte cultures.
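
The genetic differentiation statistic (FST) reported above can be illustrated for a single biallelic SNP. The sketch below uses Hudson's estimator, which is an assumption made for illustration: the text here does not state which FST estimator the authors used, and the allele frequencies and sample sizes in the usage lines are hypothetical rather than taken from the study.

```python
import math


def hudson_fst(p1, n1, p2, n2):
    """Hudson's FST estimate for one biallelic SNP.

    p1, p2: frequency of the same allele in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population.
    Returns NaN when the SNP is fixed for the same allele in both samples.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each population needs at least two chromosomes")
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("allele frequencies must lie in [0, 1]")
    # Squared frequency difference, corrected for finite sample size.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1.0 - p1) / (n1 - 1)
        - p2 * (1.0 - p2) / (n2 - 1)
    )
    # Probability that two chromosomes, one from each population, differ.
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if denominator == 0.0:
        return math.nan
    return numerator / denominator


# Hypothetical frequencies: an allele common in one sample and rare in the other.
print(hudson_fst(0.95, 200, 0.05, 200))
# Identical frequencies give a value slightly below zero because of the
# sample-size correction.
print(hudson_fst(0.5, 100, 0.5, 100))
```

A per-SNP value such as this one is usually ranked against the distribution of values across the chromosome, which is how the text describes rs112332856 as falling in the top 0.05% on chromosome 19.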



MFSD12 suppresses eumelanin biogenesis in melanocytes from lysosomes

We silenced expression of the mouse ortholog of MFSD12 (Mfsd12) using short hairpin RNAs (shRNAs) in immortalized melan-Ink4a mouse melanocytes derived from C57BL/6J-Ink4a−/− mice (32), which almost exclusively make eumelanin (Fig. 6). Reduction of Mfsd12 mRNA by ~80% with two distinct lentivirally encoded shRNAs (Fig. 6A) caused a 30 to 50% increase in melanin content compared to control cells (Fig. 6B), and a higher percentage of melanosomes per total cell area in most cells compared to cells transduced with nontarget shRNA (Fig. 6, C and D). A fraction of MFSD12-depleted cells harbored large clumps of melanin in autophagosome-like structures (fig. S10). These data suggest that MFSD12 suppresses eumelanin content in melanocytes and may offset autophagy.

We assessed the localization of human MFSD12 isoform c (RefSeq NM_174983.4) tagged at the C terminus with the hemagglutinin (HA) epitope (MFSD12-HA). By immunofluorescence microscopy, MFSD12-HA localized to punctate structures throughout the cell. Surprisingly, these puncta, like those labeled by the endogenous lysosomal membrane protein LAMP2, but not the melanosomal enzyme TYRP1, overlapped only weakly with pigmented melanosomes (Fig. 6, E to G; quantified in Fig. 6H). Instead, MFSD12-HA colocalized with LAMP2 (Fig. 6E; quantified in Fig. 6H), indicating that the MFSD12 protein localizes to late endosomes and/or lysosomes in melanocytes and not to eumelanosomes.

MFSD12 influences pigmentation in zebrafish xanthophore pigment cells

We targeted transmembrane domain 2 (TMD2) in the highly conserved zebrafish ortholog of mfsd12a with CRISPR-Cas9 (Fig. 7). We focused on mfsd12a because its paralog mfsd12b is predicted to be a pseudogene (33). Although pigmentation was not qualitatively altered in melanophores, the cells that make eumelanin, compound heterozygotes of mfsd12a alleles exhibited reduced staining of xanthophores, the cells responsible for pteridine-based yellow pigmentation in wild-type zebrafish (Fig. 7, A and B) (34, 35). This was not due to a failure of the xanthophores to develop in mfsd12a mutants, because green fluorescent protein (GFP)–labeled xanthophores were robust along the lateral line in both wild-type and mfsd12a mutant zebrafish (Fig. 7, C and D). Together, these results suggest that MFSD12 influences xanthophore pigment production in pterinosomes.

[Fig. 3 image, panels A to D, each plotting −log10(P-value) against genomic position above annotation tracks (MITF, CTCF, and H3K27ac, DHS, and chromHMM for keratinocytes and melanocytes; ChIA-PET in panels B and C). (A) SLC24A5 locus, positions 48400000 to 48460000; labeled SNP: rs1426654; genes: SLC24A5, MYEF2. (B) MFSD12 locus, positions 3540000 to 3570000; labeled SNPs: rs10424065, rs56203814, rs10414812, rs112332856, rs111317445, rs73527942, rs142317543, rs6510760; genes: FZR1, MFSD12, HMG20B; a TFAP2A motif is marked. (C) DDB1/TMEM138 locus, positions 61040000 to 61160000; labeled SNPs: rs11230664, rs7120594, rs12289370, rs7948623, rs1377457, rs148172827, rs12275843, rs7934735; genes: VWCE, DDB1, TKFC, CYB561A3, TMEM138, TMEM216. (D) OCA2/HERC2 locus, positions 28300000 to 28500000; labeled SNPs: rs4932620, rs1800404, rs6497271; genes: HERC2, OCA2. Legend: Noncoding, Synonymous, Nonsynonymous; P < 5 × 10^−8, P > 5 × 10^−8.]

Fig. 3. Genomic context of GWAS loci. Plot of −log10(P value) versus genomic position for variants near the four regions with most strongly associated SNPs from GWAS, including annotations for genes, MITF ChIP-seq (chromatin immunoprecipitation sequencing) data for melanocytes (48), a CTCF ChIP-seq track for NHEK keratinocytes, and H3K27ac, DNase I hypersensitive sites (DHS), and chromHMM tracks for melanocytes and keratinocytes from the Roadmap Epigenomics data set (30). Genome-wide significant variants are highlighted in red. Circles, squares, and triangles denote noncoding, synonymous, and nonsynonymous variants, respectively. (A) SLC24A5 locus. (B) MFSD12 locus. (C) DDB1/TMEM138 locus. (D) OCA2/HERC2 locus.

Functional characterization of MFSD12 in mice

CRISPR-Cas9 was used to generate an Mfsd12 null allele in a wild-type mouse background (Fig. 7E and fig. S11). Four founders were observed with a uniformly gray coat color, rather than the expected agouti coat color (fig. S11, A and B). These four gray founders harbored deletions at



Erratum 14 December 2017. See erratum.
R ES E A RC H | R E S EA R C H A R T I C LE

the targeted site (fig. S11C). Microscopic observation revealed a lack of pheomelanin, resulting in white, rather than yellow, banding of hairs in Mfsd12 mutants (Fig. 7F).

The Mfsd12 knockout coat color appeared phenotypically similar to that of grizzled (gr) mice, an allele previously mapped to a syntenic ~2-Mb region overlapping Mfsd12 (36). Like our CRISPR-Cas9 Mfsd12 knockout, homozygous gr/gr mice are characterized by a gray coat resulting from dilution of yellow pheomelanin pigment from the subterminal agouti band of the hair shaft. Exome sequencing of an archived gr/gr DNA sample, subsequently confirmed by Sanger sequencing in an independent colony, identified a 9-bp in-frame deletion within exon 2 of Mfsd12 (fig. S12) as the sole mutation affecting a coding sequence in this mapped candidate region. The deleted amino acids for the gr/gr allele, Mfsd12 p.Leu163_Ala165del, are in the cytoplasmic loop between the transmembrane domains TM4 and TM5 within a highly conserved MFS domain (fig. S13). These results indicate that mutation of Mfsd12 is responsible for the gray coat color of gr/gr mutant mice, and that loss of Mfsd12 reduces pheomelanin within the hairs of agouti mice.

Together, these results indicate that MFSD12 plays a conserved role in vertebrate pigmentation. Depletion of MFSD12 increases eumelanin content in a cell-autonomous manner in skin melanocytes, consistent with the lower levels of MFSD12 expression observed in melanocytes from individuals with African ancestry. Because MFSD12 localizes to lysosomes and not to eumelanosomes, this may reflect an indirect effect through modified lysosomal function. By contrast, loss of MFSD12 has the opposite effect on pheomelanin production, reflecting a more direct effect on function of pheomelanosomes, which have a distinct morphology (3), gene expression profile (37), and, like zebrafish pterinosomes, a potentially different intracellular origin from eumelanosomes (38). Although disruption of MFSD12 alone accounts for changes in pigmentation, the role of neighboring loci such as HMG20B on pigmentation remains to be explored.

Fig. 4. Coalescent trees and TMRCA dating. Inferred genealogies for regions flanking candidate causal loci. Each leaf corresponds to a single sampled chromosome from 1 of 278 individuals in the Simons Genome Diversity Project (13). Leaf nodes are colored by the population of origin of the individual, and sequences carrying the light allele are indicated with an open circle, located next to the leaf node. Node heights and 95% CI are presented for a subset of internal nodes. Gene genealogies are shown for regions flanking (A) SLC24A5, rs1426654 (15:48426484); (B) MFSD12, rs10424065 (19:3545022); (C) MFSD12, rs6510760 (19:3565253); (D) TMEM138, rs7948623 (11:61137147); (E) DDB1, rs11230664 (11:61076372); (F) OCA2, rs1800404 (15:28235773); (G) HERC2, rs4932620 (15:28514281); and (H) HERC2, rs6497271 (15:28365431). [Legend: populations are West Africa, East Africa, North Africa, Pygmy, San, Oceania, West Eurasia, Middle East, Central Asia/Siberia, South Asia, East Asia, and America; open circles mark the light allele.]

Skin pigmentation–associated loci that play a role in UV response are targets of selection

Another genomic region associated with pigmentation encompasses a ~195-kb cluster of genes on chromosome 11 that play a role in UV response and melanoma risk, including the damage-specific DNA binding protein 1 (DDB1) gene (Figs. 1 and 3 and table S3). DDB1 (complexed with DDB2 and XPC) functions in DNA repair (39); levels of DDB1 are regulated by UV exposure and MC1R signaling, a regulatory pathway of pigmentation (40). DDB1 is a component of CUL4-RING E3 ubiquitin ligases that regulate several cellular and developmental processes (41); it is critical for follicle maintenance and female fertility in mammals (42) and for plastid size and fruit pigmentation in tomatoes (43). Knockouts of DDB1 orthologs are lethal in both mouse and fruitfly development (44), and DDB1 only exhibits rare (<1% frequency) nonsynonymous mutations in the TGP data set. Genetic variants near DDB1 were associated with human pigmentation in an African population with high levels of European admixture (7).

Because of extensive LD in this region, CAVIAR identified 33 SNPs predicted to be causal (Table 1). The most strongly associated SNPs are located in a region conserved across vertebrates flanked by TMEM138 and TMEM216 (45) ~36 to 44 kb upstream of DDB1 and are in high LD within this cluster (r² > 0.7 in East Africans) (Fig. 3, Table 1, and table S3). Among these, the most significantly associated SNP is rs7948623 (F test, P = 2.2 × 10^−11), located 172 bp downstream of TMEM138, which shows enhancer activity in WM88 melanoma cells (91.9 to 140.8× higher than the minimal promoter; fig. S7 and table S5) and interacts with the promoters of DDB1 and neighboring genes in MCF-7 cells (Table 1 and Fig. 3) (46, 47).

A second group of tightly linked SNPs (LD r² > 0.7 in East Africans) with predicted high probability of containing causal variants spans a ~195-kb region encompassing DDB1 and TMEM138 (Table 1 and Fig. 3). Two SNPs that tag this LD block are rs1377457 (F test, P = 1.5 × 10^−9), located ~7600 bp downstream of TMEM138, and rs148172827 (F test, P = 1.8 × 10^−9), an insertion/deletion polymorphism at TKFC (triokinase and FMN cyclase) located in an enhancer active in WM88 melanoma cells (67.6 to 76.2× higher than the minimal promoter; fig. S7 and table S5), which overlaps an MITF binding site in melanocytes (30, 48); both SNPs interact with the promoters of DDB1 and neighboring genes in MCF-7 cells (Table 1 and Fig. 3) (46, 47). SNPs within introns of DDB1 (rs12289370, rs7934735, rs11230664, rs12275843, and rs7120594) also tag this LD block (Table 1 and Fig. 3).

RNA-seq data from 106 primary melanocyte cultures indicate that African ancestry is significantly



Fig. 5. Haplotype networks at SLC24A5, MFSD12, DDB1/TMEM138, and OCA2/HERC2. Median-joining haplotype networks of regions containing candidate causal variants. Connections between circles indicate genetic relatedness, whereas size is relative to the frequency of haplotypes. Ancestry proportions are displayed as pie charts. Yellow and red subfigures indicate which haplotypes contain the allele associated with dark pigmentation (red) or light pigmentation (yellow). (A) Region (75 kb) flanking the causal variant at SLC24A5. (B and C) Regions (3 kb) flanking rs10424065 in MFSD12 and rs6510760 upstream of MFSD12. (D) Region (195 kb) flanking DDB1 extending from PGA5 to SDHAF2. (E to G) Regions 1, 3, and 2 (50 kb) at OCA2 and HERC2 (ordered based on highest to lowest probability of being causal from CAVIAR analysis).

correlated with increased DDB1 gene expression (PCC, P = 2.6 × 10^−5; fig. S9). Association tests using a permutation approach indicated that, of the 35 protein-coding genes with a transcription start site within 1 Mb of rs7948623, expression of DDB1 is most strongly associated with a SNP in an intron of DDB1, rs7120594, at marginal statistical significance after correction for ancestry and multiple testing (P_adj = 0.06; fig. S9). The allele associated with dark pigmentation at rs7120594 correlates with increased DDB1 expression. We did not have the power to detect an association between expression of DDB1 and SNPs in LD with rs7948623 due to low minor allele frequencies (~2%). The role of DDB1 and neighboring loci



Fig. 6. MFSD12 suppresses eumelanin production but localizes to lysosomes. Immortalized melan-Ink4a melanocytes expressing nontarget (sh NT) shRNA or either of two shRNA plasmid clones (#1 and 2) targeting Mfsd12 were analyzed for (A) Mfsd12 mRNA content by quantitative reverse transcription polymerase chain reaction (qRT-PCR), (B) melanin content by spectrophotometry, or (C) percentage of cell area containing melanin by bright-field microscopy. (D) Quantification. Data in (A) to (C) represent means ± SEM, normalized to sh NT samples, from three separate experiments. In (C), n (sh NT) = 97 cells, n (shMfsd12 #1) = 68 cells, and n (shMfsd12 #2) = 71 cells. Scale bar, 10 µm (C). (E to G) Melan-Ink4a melanocytes transiently expressing MFSD12-HA (E) or not transfected (F and G) were fixed, immunolabeled for HA (E) and for LAMP2 to mark lysosomes (E and G) or for TYRP1 to mark melanosomes (F), and analyzed by immunofluorescence and bright-field microscopy. Bright-field (melanin) images show pigmented melanosomes (pseudocolored red in the merged images). Insets, 4× magnification of boxed regions. Arrows, MFSD12-containing structures that overlap LAMP2 (E) or TYRP1-containing structures that overlap melanosomes (F); arrowheads, structures that do not overlap (G). Scale bars, 10 µm. (H) Quantification of overlap for structures labeled by MFSD12, TYRP1, LAMP2, and pigment. Data represent means ± SEM from three independent experiments; n = 17 cells (MFSD12 overlap with LAMP2 and melanin), 33 cells (TYRP1 overlap with melanin), or 23 cells (LAMP2 and melanin).

in human pigmentation remains to be further explored.

The derived rs7948623 (T) allele near TMEM138 (associated with dark pigmentation) is most common in East African Nilo-Saharan populations and is at moderate to high frequency in South Asian and Australo-Melanesian populations (Fig. 1 and fig. S4). At SNP rs11230664, within DDB1, the ancestral (C) allele (associated with dark pigmentation) is common in all sub-Saharan African populations, having the highest frequency in East African Nilo-Saharan, Hadza, and San populations (88 to 96%), and is at moderate to high frequency in South Asian and Australo-Melanesian populations (12 to 66%) (Fig. 1 and fig. S4). The derived (T) allele (associated with light pigmentation) is nearly fixed in European, East Asian, and Native American populations.

In South Asians and Australo-Melanesians, the alleles associated with darker pigmentation reside on haplotypes closely related, or identical, to those observed in Africa (Fig. 5 and fig. S6), suggesting that they are identical by descent. The TMRCAs for the derived dark allele at rs7948623 and the derived light allele at rs11230664 are estimated to be older than 600 and 250 ka, respectively (Fig. 4).

Consistent with a selective sweep, we see an excess of rare alleles (and extreme negative Tajima's D values) and high levels of homozygosity extending ~350 to 550 kb in Europeans and Asians, respectively (figs. S5 and S14). We observe extreme negative Tajima's D values in East African Nilo-Saharans and San over a shorter distance (115 and 100 kb, respectively) (fig. S5). A haplotype extending greater than 195 kb is common in Eurasians and rare in Africans (Fig. 5) and tags the alleles associated with light skin pigmentation. The TMRCA of a large number of haplotypes carrying the rs7948623 (A) allele in non-Africans, associated with light pigmentation, is 60 ka (95% CI, 58 to 62 ka), close to the inferred time of the migration of modern humans out of Africa (Fig. 4) (49). These results, combined with large F_ST values between Africans and Europeans at SNPs tagging the extended haplotype near DDB1 (for example, F_ST = 0.98 between Nilo-Saharans and CEU at rs7948623, within the top 0.01% of values


on chromosome 11; table S4), are consistent with differential selection of alleles associated with light and dark pigmentation in Africans and non-Africans at this locus.

Fig. 7. In vivo zebrafish and mouse models of MFSD12 deficiency. (A and B) Representative images of methylene blue staining in wild-type TAB5 (A) and compound heterozygous mutant mfsd12a zebrafish (6 days postfertilization) (B). Note the absence of stained xanthophores in the mfsd12a mutant (B). (C and D) No difference was observed in the number or distribution of xanthophores detected by mosaic Tg(aox5:PALM-GFP) expression in injected wild-type TAB5 (C) or compound heterozygous mutant mfsd12a (D) zebrafish (5 days postfertilization). (E) Wild-type agouti mouse (left) with a gray Mfsd12-targeted littermate (right). (F) Hair from the Mfsd12-targeted mouse has grossly normal eumelanin (lower black region of the hair shaft); however, the upper subapical yellow band in wild-type (E, left) appears white in the Mfsd12 mutant (E, right) due to a reduction in pheomelanin.

Identification of variation at OCA2 and HERC2 affecting skin pigmentation

Another region of significantly associated SNPs encompasses the OCA2 and HERC2 loci on chromosome 15 (Fig. 3 and table S3). HERC2 was identified in GWAS for eye, hair, and skin pigmentation traits (5–7, 50–52). The oculocutaneous albinism II gene (OCA2, formerly called the P gene) encodes a 12-transmembrane domain–containing chloride transporter protein and affects pigmentation by modulating melanosomal pH (53). The most common types of albinism in Africans are caused by mutations in OCA2 (54).

Because of extensive LD in the OCA2 and HERC2 region, CAVIAR predicted 10 potentially causal SNPs (Table 1) that cluster within three regions. We order these clusters based on physical distance; region 1 is located within OCA2, and regions 2 and 3 are located within introns of HERC2 (Fig. 3).

The SNP with highest probability of being causal from CAVIAR analysis is rs1800404 (F test, P = 1.0), a synonymous variant located in region 1 within exon 10 of OCA2 (Fig. 3, Table 1, and table S3) associated with eye color in Europeans (55). The ancestral rs1800404 (C) allele, associated with dark pigmentation, is common in most Africans as well as southern and eastern Asians and Australo-Melanesians, whereas the derived (T) allele, associated with light pigmentation, is most common (frequency >70%) in Europeans and San (Fig. 1 and fig. S4), consistent with a previous observation (56). Haplotype (Fig. 5) and coalescent analyses (Fig. 4 and fig. S6) show two divergent clades, one enriched for the rs1800404 (C) allele and the other for the rs1800404 (T) allele. Coalescent analysis indicates that the TMRCA of all lineages is 1.7 Ma (95% CI, 1.5 to 2.0 Ma), and the TMRCA of lineages containing the derived (T) allele is 629 ka (95% CI, 426 to 848 ka) (Fig. 4). The deep coalescence of lineages, and the positive Tajima's D values in this region in both African and non-African populations (fig. S5), is consistent with balancing selection acting at this locus.

The SNP with highest probability of being causal in region 3 is rs4932620 (F test, P = 3.2 × 10^−9) located within intron 11 of HERC2 (Fig. 3, Table 1, and table S3). This SNP is 917 bp from rs916977, a SNP associated with blue eye color in Europeans (57, 58), and is in strong LD (r² = 1.0 in most East African populations), with SNPs extending into region 2 of HERC2 (Table 1). The derived rs4932620 (T) allele associated with dark skin pigmentation is most common in Ethiopian populations with high levels of Nilo-Saharan ancestry and is at moderate frequency in other Ethiopian, Hadza, and Tanzania Nilo-Saharan populations (Fig. 1 and fig. S4). Haplotype analysis indicates that the rs4932620 (T) allele in South Asians and Australo-Melanesians is on the same or similar haplotype background as in Africans (Fig. 5 and fig. S6), suggesting that it is identical by descent. The TMRCA of haplotypes containing the rs4932620 (T) allele is 247 ka (95% CI, 158 to 345 ka) (Fig. 4).

We also observe an LD block of SNPs within region 2 of HERC2 that are associated with skin pigmentation, although they do not reach genome-wide significance (table S3). These are in a region with enhancer activity in Europeans (50). For example, SNP rs6497271 (F test, P = 1.8 × 10^−6), which is located 437 bp from SNP rs12913832, has been associated with skin color in Europeans (50) and is in a consensus SOX2 motif (a transcription factor that modulates levels of MITF in melanocytes) (Fig. 3) (59). The ancestral rs6497271 (A) allele associated with dark pigmentation is on haplotypes in South Asians and Australo-Melanesians similar or identical to those in Africans (Fig. 5 and fig. S6), suggesting that they are identical by descent. The derived (G) allele associated with light skin pigmentation is most common in Europeans and San and dates to 921 ka (95% CI, 703 ka to 1.2 Ma) (Figs. 1 and 4 and figs. S4 and S6). SNPs associated with pigmentation at all three regions show high allelic differentiation when comparing East African Nilo-Saharans and CEU (F_ST = 0.72 to 0.85, top 0.5% on chromosome 15) (table S4).

Analyses of RNA-seq data from 106 primary melanocyte cultures indicate that African ancestry is significantly correlated with increased OCA2 gene expression (PCC, P_adj = 6.1 × 10^−7) (fig. S9). A permutation approach identified significant associations between OCA2 expression and SNPs within an LD block tagged by rs4932620 extending across regions 2 and 3 (P_adj = 2.2 × 10^−2).
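The sign of Tajima's D carries the interpretation used in this section: negative values accompany an excess of rare alleles, as expected after a selective sweep, and positive values accompany an excess of intermediate-frequency alleles, as expected under balancing selection. The following is a minimal Python sketch of the standard Tajima (1989) statistic applied to toy haplotypes. The function name `tajimas_d`, the 0/1 string encoding, and the example data are assumptions made for illustration; the windowing and software used for the analyses in fig. S5 are not described here, so this is not a reproduction of that pipeline.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length '0'/'1' haplotype strings.

    Each string is one sampled chromosome; each position is one biallelic
    site ('1' = derived allele). Returns NaN if no site is segregating.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n = 2 or 3, so D is undefined.
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    # Derived-allele count at each site; keep only segregating sites.
    counts = [sum(int(h[i]) for h in haplotypes) for i in range(length)]
    seg = [k for k in counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return float("nan")

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with Watterson's estimator S / a1.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): every variant is a singleton, as after a sweep.
sweep_like = ["1000", "0100", "0010", "0001", "0000", "0000"]
# Toy data (hypothetical): two deep clades at equal frequency.
balanced_like = ["1111", "1111", "1111", "0000", "0000", "0000"]

print(tajimas_d(sweep_like))     # negative
print(tajimas_d(balanced_like))  # positive
```

In the toy data, the singleton-only sample gives a negative D and the two-clade sample gives a positive D, mirroring the contrast drawn between the DDB1 region and the OCA2 region 1.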


Table 1. Annotations of candidate causal SNPs from GWAS. Top candidate causal variants for the four regions identified based on analysis with CAVIAR (11). For each variant, the genomic position (Location), RSID, and Ancestral>Derived alleles are shown, with the allele associated with dark pigmentation in bold. Beta and standard error [Beta(SE)] and the P values from the GWAS (F test, linear mixed model) are given. For functional genomic data, nearest genes are given and variants overlapping DHS sites for melanocytes (E059) (DHS melanocytes) and/or other cell types (DHS other) available from Roadmap Epigenomics are indicated with X (30, 92). Variants intersecting enhancer regions tested by luciferase assays were labeled with Y (significant enhancer activity) or N (no enhancer activity) (fig. S7). Chromatin interactions with nearby genes measured in MCF-7 or K562 cell lines as identified by ChIA-PET are listed with gene names (Chromatin interactions) (46, 47). SNPs that are in strong LD (r² > 0.7 in East Africans) are numerically labeled in the column titled LD block.

DHS DHS Luciferase LD Nearest Chromatin


Location RSID Ancestral>Derived Beta(SE) P
melanocytes other activity block gene interactions

15:48485926 rs2413887 T>C 7.70(0.44) 4.9 × 10−62 1 CTXN2 MYEF2


..................................................................................................................................................................................................................................................................................................................................................
15:48426484 rs1426654 G>A 7.69(0.44) 5.5 × 10−62 X X N 1 SLC24A5
..................................................................................................................................................................................................................................................................................................................................................
15:48392165 rs1834640 G>A .56(0.44) 3.2 × 10−61 X 1 SLC24A5
..................................................................................................................................................................................................................................................................................................................................................
15:48400199 rs2675345 G>A 7.62(0.44) 6.7 × 10−61 1 SLC24A5
..................................................................................................................................................................................................................................................................................................................................................
15:48460188 rs8028919 G>A −4.95(0.41) 5.0 × 10−32 1 MYEF2
..................................................................................................................................................................................................................................................................................................................................................
19:3545022 rs10424065 C>T 4.48(0.48) 5.1 × 10−20 X X Y 2 MFSD12 CACTIN, MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3544892 rs56203814 C>T −4.38(0.50) 3.6 × 10−18 X Y 2 MFSD12 CACTIN, MFSD12
..................................................................................................................................................................................................................................................................................................................................................
−16

Downloaded from http://science.sciencemag.org/ on August 18, 2019


19:3566631 rs111317445 C>T 3.51(0.42) 1.7 × 10 N 3 HMG20B MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3547955 rs10414812 C>T 4.38(0.53) 3.8 × 10−16 X Y 2 MFSD12 CACTIN, FZR1,
MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3565599 rs112332856 T>C 3.52(0.43) 3.8 × 10−16 X X Y 3 HMG20B MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3565253 rs6510760 G>A 3.54(0.45) 6.5 × 10−15 X X Y 3 MFSD12 MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3545150 rs73527942 T>G −3.58(0.47) 4.8 × 10−14 X X Y 2 MFSD12 CACTIN, MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3547685 rs142317543 C>T 6.99(0.92) 5.0 × 10−14 X X Y 2 MFSD12 CACTIN, FZR1,
MFSD12
..................................................................................................................................................................................................................................................................................................................................................
−9
19:3566513 rs7254463 C>T 2.90(0.50) 9.0 × 10 X N 3 HMG20B MFSD12
..................................................................................................................................................................................................................................................................................................................................................
19:3565357 rs7246261 C>T 2.71(0.47) 1.1 × 10−8 X X 3 HMG20B MFSD12
..................................................................................................................................................................................................................................................................................................................................................
−8
19:3565909 rs6510761 T>C 2.79(0.50) 2.2 × 10 X 3 HMG20B MFSD12
..................................................................................................................................................................................................................................................................................................................................................
11:61137147 rs7948623 A>T −2.94(0.44) 2.2 × 10−11 X X Y 4 TMEM138 TKFC, DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61148456 rs397709980 G/GA −2.90(0.43) 2.4 × 10−11 X N 4 TMEM216
..................................................................................................................................................................................................................................................................................................................................................
11:61152630 rs4453253 C>T −2.85(0.43) 5.4 × 10−11 X X N 4 TMEM216 CYB561A3, TKFC,
DDB1, TMEM138,
TMEM216
..................................................................................................................................................................................................................................................................................................................................................
11:61153401 rs4939520 C>T −2.79(0.43) 1.4 × 10−10 X N 4 TMEM216 CYB561A3, TKFC,
DDB1, TMEM138,
TMEM216
..................................................................................................................................................................................................................................................................................................................................................
11:61142943 rs4939519 C>T −2.47(0.39) 2.8 × 10−10 X 4 TMEM138 TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61106525 rs2512809 C>T −2.93(0.47) 7.4 × 10−10 N 5 TKFC TKFC, DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
−10
11:61046876 rs11230658 T>C 3.01(0.49) 9.4 × 10 5 VWCE
..................................................................................................................................................................................................................................................................................................................................................
11:61084180 rs12289370 G>A 2.99(0.49) 1.3 × 10−9 X 5 DDB1 TKFC, DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61144652 rs1377457 C>A −3.01(0.49) 1.5 × 10−9 X 5 TMEM138 CYB561A3, TKFC,
DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61088140 rs7934735 G>T 2.98(0.49) 1.5 × 10−9 5 DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61141476 rs7394502 G>A −2.41(0.40) 1.6 × 10−9 5 TMEM138 TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61141164 rs10897155 C>T −2.41(0.40) 1.6 × 10−9 Y 5 TMEM138 TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61139869 rs11230678 G>A −2.41(0.40) 1.7 × 10−9 5 TMEM138 TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61115821 rs148172827 C/CATCAA −2.95(0.49) 1.8 × 10−9 X X Y 5 TKFC CYB561A3, TKFC,
DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61144707 rs1377458 C>T −2.40(0.40) 2.1 × 10−9 X 5 TMEM138 CYB561A3, TKFC,
DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61076372 rs11230664 C>T 2.95(0.49) 2.1 × 10−9 X N 5 DDB1 DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61122878 rs7951574 G>A 2.92(0.49) 2.8 × 10−9 5 CYB561A3
..................................................................................................................................................................................................................................................................................................................................................
11:61054892 rs1108769 A>C 2.82(0.47) 3.0 × 10−9 X 5 VWCE TKFC, DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61141259 rs57265008 T>C 2.34(0.39) 3.7 × 10−9 Y 4 TMEM138 TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61222635 rs3017597 G>A −2.77(0.47) 5.4 × 10−9 5 SDHAF2
..................................................................................................................................................................................................................................................................................................................................................
11:61075524 rs12275843 T>C 2.64(0.45) 5.5 × 10−9 5 DDB1 DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61043773 rs73490303 G>C 2.67(0.46) 7.2 × 10−9 5 VWCE VWCE
..................................................................................................................................................................................................................................................................................................................................................
11:61018855 rs653173 A>G 2.68(0.46) 8.2 × 10−9 X 5 PGA5
..................................................................................................................................................................................................................................................................................................................................................
11:61063156 rs10897150 G>T 2.79(0.48) 8.8 × 10−9 X 5 VWCE TKFC, DDB1, VWCE
..................................................................................................................................................................................................................................................................................................................................................


Crawford et al., Science 358, eaan8433 (2017) 17 November 2017 9 of 14



Location RSID Ancestral>Derived Beta(SE) P DHS melanocytes DHS other Luciferase activity LD block Nearest gene Chromatin interactions
..................................................................................................................................................................................................................................................................................................................................................
11:61108974 rs2260655 G>A 2.63(0.46) 9.0 × 10−9 5 TKFC CYB561A3, TKFC, DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61152028 rs12791961 C>A 2.90(0.50) 9.7 × 10−9 X 5 TMEM216 CYB561A3, TKFC, DDB1, TMEM138, TMEM216
..................................................................................................................................................................................................................................................................................................................................................
11:61055014 . GACTA/G 2.61(0.45) 1.1 × 10−8 X 5 VWCE TKFC, DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61080557 rs7120594 T>C 2.58(0.45) 1.2 × 10−8 5 DDB1
..................................................................................................................................................................................................................................................................................................................................................
11:61044470 rs9704187 G>C 2.58(0.45) 1.3 × 10−8 5 VWCE VWCE
..................................................................................................................................................................................................................................................................................................................................................
11:61106892 rs2513329 G>C −2.73(0.48) 1.6 × 10−8 N 5 TKFC CYB561A3, TKFC, DDB1, TMEM138, TMEM216
..................................................................................................................................................................................................................................................................................................................................................
11:61033525 rs2001746 T>A 2.56(0.45) 1.7 × 10−8 5 VWCE
..................................................................................................................................................................................................................................................................................................................................................
11:61112802 rs2305465 C>T 2.62(0.46) 1.8 × 10−8 X X 5 TKFC CYB561A3, TKFC, DDB1, TMEM138
..................................................................................................................................................................................................................................................................................................................................................
11:61037389 . ATT/A 2.51(0.45) 3.5 × 10−8 5 VWCE
..................................................................................................................................................................................................................................................................................................................................................
15:28514281 rs4932620 C>T −2.85(0.48) 3.2 × 10−9 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28532639 rs1667393 C>T −2.82(0.48) 6.3 × 10−9 X N 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................



15:28535675 rs1635167 C>T −2.88(0.50) 8.9 × 10−9 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28545148 rs2905952 A>G −3.16(0.55) 9.0 × 10−9 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28396894 rs12915877 T>G −2.76(0.48) 1.1 × 10−8 X 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28487069 rs4932618 G>A −2.69(0.47) 1.6 × 10−8 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28235773 rs1800404 C>T 2.54(0.45) 1.6 × 10−8 X 7 OCA2
..................................................................................................................................................................................................................................................................................................................................................
15:28238158 rs1868333 G>A −2.53(0.45) 2.2 × 10−8 7 OCA2
..................................................................................................................................................................................................................................................................................................................................................
15:28419497 . TA/T −3.73(0.67) 2.6 × 10−8 6 HERC2
..................................................................................................................................................................................................................................................................................................................................................
15:28238895 rs735066 A>G −2.50(0.45) 3.5 × 10−8 X 7 OCA2
..................................................................................................................................................................................................................................................................................................................................................
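
As a rough consistency check on the Beta(SE) and P columns of the table above, a two-sided P value can be approximated from the ratio of the effect size to its standard error. The sketch below assumes a normal (Wald-type) approximation; it does not reproduce the EMMAX mixed-model test used in the study, and the helper name `approx_two_sided_p` is hypothetical. Small differences from the tabulated values are expected because Beta and SE are rounded to two decimals.

```python
import math


def approx_two_sided_p(beta: float, se: float) -> float:
    """Two-sided P value for z = beta / se under a normal approximation.

    Assumption: the test statistic is treated as standard normal; this is
    an illustration, not the mixed-model test reported in the study.
    """
    z = abs(beta) / se
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Rows copied from the table above: (RSID, Beta, SE, reported P).
rows = [
    ("rs4932620", -2.85, 0.48, 3.2e-9),
    ("rs1800404", 2.54, 0.45, 1.6e-8),
]

for rsid, beta, se, reported in rows:
    approx = approx_two_sided_p(beta, se)
    print(f"{rsid}: approximate P = {approx:.1e}, reported P = {reported:.1e}")
```

Under these assumptions the approximate values should fall within the same order of magnitude as the reported ones.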

Alleles in this LD block associated with dark pigmentation correlate with increased OCA2 expression. We did not observe associations between the candidate causal variants in region 1 and OCA2 expression despite a high minor allele frequency (34%). However, we observe a significant association between a haplotype tagged by rs1800404 and alternative splicing resulting in inclusion/exclusion of exon 10 (linear regression t test, P = 9.1 × 10⁻⁴⁰). Exon 10 encodes the amino acids encompassing the third transmembrane domain of OCA2 and is the location of several albinism-associated OCA2 mutations (60, 61), raising the possibility that the shorter transcript encodes a nonfunctional channel. Comparing splice junction usage across individuals, we estimate that each additional copy of the light rs1800404 (T) allele reduces inclusion of exon 10 by ~20% (95% CI, 17.9 to 21.5%; fig. S9). Therefore, homozygotes for the light rs1800404 (T) allele are expected to produce ~60% functional OCA2 protein (compared to individuals with albinism who produce no functional OCA2 protein).

Skin pigmentation is a complex trait
To estimate the proportion of pigmentation variance explained by the top eight candidate SNPs at SLC24A5, MFSD12, DDB1/TMEM138, and OCA2/HERC2, we used a linear mixed model with two genetic random effect terms: one based on the genome-wide kinship matrix and the other based on the kinship matrix derived from the set of significant variants. About 28.9% (SE, 10.6%) of the pigmentation variance is attributable to these SNPs. Considering each locus in turn and all significantly associated variants (P < 5 × 10⁻⁸), the trait variation attributable to each locus is as follows: SLC24A5 (12.8%; SE, 3.5%), MFSD12 (4.5%; SE, 2.1%), DDB1/TMEM138 (2.2%; SE, 1.5%), and OCA2/HERC2 (3.9%; SE, 2.9%). Thus, ~29% of the additive heritability of skin pigmentation in Africans is due to variation at these four regions. This observation indicates that the genetic architecture of skin pigmentation is simpler (that is, fewer genes of stronger effect) than other complex traits, such as height (62). In addition, most candidate causal variants are in noncoding regions, indicating the importance of regulatory variants influencing skin pigmentation phenotypes.

Evolution of skin pigmentation in modern humans
Skin pigmentation is highly variable within Africa. Populations such as the San from southern Africa are the most lightly pigmented among Africans, whereas the East African Nilo-Saharan populations are the most darkly pigmented in the world (Fig. 1). Most alleles associated with light and dark pigmentation in our data set are estimated to have originated before the origin of modern humans ~300 ka (27). In contrast to the lack of variation at MC1R, which is under purifying selection in Africa (63), our results indicate that both light and dark alleles at MFSD12, DDB1, OCA2, and HERC2 have been segregating in the hominin lineage for hundreds of thousands of years (Fig. 4). Furthermore, the ancestral allele is associated with light pigmentation in about half of the predicted causal SNPs; Neandertal and Denisovan genome sequences, which diverged from modern human sequences 804 ka (64), contain the ancestral allele at all loci. These observations are consistent with the hypothesis that darker pigmentation is a derived trait that originated in the genus Homo within the past ~2 million years (My) after human ancestors lost most of their protective body hair, although these ancestral hominins may have been moderately, rather than darkly, pigmented (65, 66). Moreover, it appears that both light and dark pigmentation have continued to evolve over hominid history.

Individuals from South Asia and Australo-Melanesia share variants associated with dark pigmentation at MFSD12, DDB1/TMEM138, OCA2, and HERC2 that are identical by descent from Africans. This raises the possibility that other phenotypes shared between Africans and some South Asian and Australo-Melanesian populations may also be due to genetic variants identical by descent from African populations rather than convergent evolution (67). This observation is consistent with a proposed southern migration route out of Africa ~80 ka (68). Alternatively, it is possible that light and dark pigmentation alleles segregated in a single African source population (13, 69) and that alleles associated with dark pigmentation were maintained outside of Africa only in the South Asian and Australo-Melanesian populations due to selection.

By studying ethnically, genetically, and phenotypically diverse Africans, we identify novel pigmentation loci that are not highly polymorphic in European populations. The loci identified in this study appear to affect multiple phenotypes. For example, DDB1 influences pigmentation (43), cellular response to the mutagenic effect of UVR (40), and female fertility (42). Thus, some of the pigmentation-associated variants identified here may be maintained because of pleiotropic effects on other aspects of human physiology.

It is important to note that genetic variants that do not reach genome-wide significance in our study might also affect the pigmentation phenotype. The 1000 most strongly associated SNPs exhibit enrichment for genes involved in pigmentation and melanocyte physiology in the mouse phenotype database and in ion transport and pyrimidine metabolism in humans (table S8). Future research in larger numbers of ethnically diverse Africans may reveal additional loci associated with skin pigmentation and will further shed light on the evolutionary history, and adaptive significance, of skin pigmentation in humans.

Materials and methods
Individuals in the study were sampled from Ethiopia, Tanzania and Botswana. Written informed consent was obtained from all participants, and research/ethics approval and permits were obtained from all relevant institutions (70). To measure skin pigmentation we used a DSM II ColorMeter to quantify reflectance from the inner under arm. Red reflectance values were converted to a standard melanin index score (70, 71). DNA was extracted from whole blood using a salting out procedure (PureGen).

A total of 1570 samples were genotyped on the Illumina Omni5M SNP array (5M dataset) that includes ~4.5 million SNPs. Genotypes were clustered and called in Genome Studio software. Variant positions are reported in hg19/37 coordinates. The overall completion rate was 98.8%. Each individual’s sex was verified based on X chromosome inbreeding coefficients. We used Beagle 4.0 (72) to phase the Illumina 5M SNP array data merged with SNPs from the TGP dataset that were filtered to exclude related individuals.

High coverage (>30×) Illumina Sequencing was performed on a subset of the genotyped individuals (N = 135). Variants were called following the approach described in (13). Adapter sequences were trimmed with trimadap. Reads were aligned using bwa mem to the human reference sequence build 37 (hg19). After alignment we marked duplicate reads prior to calling variants with GATK HaplotypeCaller (73). To select high quality variants we employed a two-set filtering strategy. First, we used the GATK variant quality score recalibration to score variant sites. We used TGP, OMIM, and our curated genotypes from the Illumina Omni 5M SNP array as training data. After recalibration we discarded sites with the lowest scores. In addition, we discarded sites in low-complexity regions listed in (74) and duplicate regions identified with Delly (75).

We performed local imputation around each of the regions showing significant associations with skin pigmentation from GWAS using the Illumina Omni 5M SNP dataset. We extracted array genotypes within 1 Mb (500 kb upstream and 500 kb downstream) of top GWAS variants from each region and phased them using SHAPEIT2 (76, 77). The reference panel came from two datasets: filtered variants from the 135 African genomes and TGP (10). After phasing, imputation was performed using Minimac3 (78). Imputation performed very well at most loci (R² > 0.91 with MAF ≥ 0.05) (table S3).

To identify SNPs associated with pigmentation, GWAS was performed first on the Illumina Omni 5M SNP dataset, and independently with imputed variants at candidate regions, using linear mixed models implemented in EMMAX software (9). Age and sex were included as covariates, and we corrected for genetic relatedness with an IBS kinship matrix. We used CAVIAR to identify variants in the imputed dataset most likely to be causal (11). Ontology enrichment for genes near the top 1000 most strongly associated variants from the 5M dataset was obtained using the annotation tool GREAT (79).

We estimated the contribution to the variation in melanin index from the top candidate causal variants with a restricted maximum likelihood (REML) analysis implemented in the Genome-wide Complex Trait Analysis (GCTA) software (80). The variance parameters for two genetic relationship matrices (GRMs) are estimated: one GRM is constructed (81) from genome-wide background variants with MAF > 0.01, and one GRM is constructed from the set of 8 top pigmentation-associated variants. The contribution of each locus to the melanin index variation is estimated similarly, using all genome-wide significant (P < 5 × 10⁻⁸) variants within each locus to construct the pigmentation-associated GRM (table S3). REML iterations are based on maximizing the Average Information matrix.

To test for neutrality in the regions flanking our top GWAS variants we calculated Tajima’s D, FST, and extended haplotype homozygosity using iHS (82–84). Tajima’s D was measured along chromosomes 11 and 15 using 50-kb sliding windows. Due to a high recombination rate observed near the MFSD12 locus, we used 10-kb windows in that region (chromosome 19). Vcftools was used to calculate both Tajima’s D and FST (85). To calculate extended haplotype homozygosity (iHS) we used Selscan (86). Unstandardized iHS scores were normalized within 100 kb bins according to the frequency of the derived allele. We then identified the signals of positive selection by calculating the proportion of SNPs with |iHS| > 2 in non-overlapping windows (83). To identify outlier windows we calculated 5th and 95th percentiles. Population differentiation was assessed with Weir and Cockerham’s fixation index FST (82) between each pair of populations. Outliers were identified using empirical P values.

Median joining haplotype networks (87) for the Illumina Omni 5M SNP dataset were constructed and visualized at genomic regions of interest using NETWORK (88). In addition, we constructed genealogies of regions flanking candidate causal SNPs using a hierarchical clustering approach with sequence data from the Simons Genome Diversity Project (13). Briefly, we considered a single copy of each chromosome from each of the 279 individuals from (13). We inferred recombination breakpoints within a symmetric window surrounding each locus using the program kwarg (89) and identified the longest shared haplotype between each pair of sequences in which no recombination events occurred. We then computed the expected coalescence time between each pair of sequences, conditional on the observed number of mutations in the non-recombining region. Genealogies were constructed by applying the WPGMA hierarchical clustering algorithm to the estimated pairwise coalescence times. Our estimator accounts for recombination events and the population size history. However, simulation studies indicate that accounting for time-varying population size has relatively little effect on our estimates when the size changes according to previously inferred histories for human populations (70, 90). Because the true population sizes and relationships among the populations we considered are complex and imprecisely known, we assumed a constant population size of N = 10⁴ in our analyses. The robustness analysis presented in (70) describes how our time estimates would change under different demographic histories and selective pressures.

To identify candidate causal GWAS variants altering gene expression we visualized and intersected variants with chromHMM tracks (91), DNase I hypersensitivity peaks, H3K27ac signal tracks for keratinocytes and melanocytes (30), CTCF signal tracks from keratinocytes (92), and ChIP-seq signal tracks from MITF (48). Variants were intersected with chromatin annotations using bedtools (93). Functional consequences of variants were also assessed using deepSEA (94) and deltaSVM (95). The effect of genetic variants on transcription factor binding was predicted using the MEME suite (96) for all transcription factors in the JASPAR 2016 CORE Vertebrate motif set (97).

To test for associations between gene expression, genetic variation, and ancestry we used eQTL and allele-specific expression (ASE) analyses on transcriptomes and genotype data from primary cultures of human melanocytes, isolated from foreskin of 106 individuals of assorted ancestries. All 106 individuals were genotyped on Illumina OmniExpress arrays and genotypes were subsequently imputed using the Michigan Imputation Server (78) based on the TGP reference panel and using SHAPEIT for phasing (76). RNA sequencing was performed to a mean depth of 87 million reads per sample. STAR (98) was used for aligning reads, and RSEM (99) was used to quantify the gene expression. Quantile normalization was applied in all samples to get the final RSEM value. To account for hidden factors driving expression variabilities, a set of covariates were further identified using the PEER method (100) and applied to calculate the normalized expression matrices. Principal components analysis was performed using genotypic data to capture population structure and ancestry using the struct.pca module of GLU (https://github.com/bioinformed/glu-genetics). Using the normalized




expression values and principal components, with renilla luciferase control vector (pRL-CMV) erate a 134 bp deletion resulting in a null allele
Pearson correlation between gene expression lev- in a dual luciferase assay. Relative luciferase activ- of Mfsd12. A mixture of Cas9 mRNA (TriLink
els and ancestry was calculated, and associations ity (firefly/renilla luminescence ratio) is presented BioTechnologies) and each of the two synthesized
between GWAS variant genotypes and gene ex- as fold change compared to cells transfected with gRNAs was used for pronuclear injection into
pression levels were evaluated using ordinary least the empty pGL4.23 vector. Data were analyzed C57BL/6J × FVB/N F1 hybrid zygotes. Mutation
squares regression. with a modified Kruskal-Wallis Rank Sum test carrying mice were viable and presented with gray
To identify associations between our GWAS and pairwise comparisons between groups were coat color distinct from littermates. Hairs were
candidate causal variants and expression of near- performed using the Conover method. P values plucked from postnatal day 18 mice and indi-
by genes (using the 106 melanocyte transcriptomes), were corrected for multiple comparisons with the vidual awl hairs were mounted in permount and
we first found all protein-coding genes with tran- Benjamini-Hochberg method using the R package imaged with a stereomicroscope (Zeiss SteREO
scription start site (TSS) within 1 Mb of the top PMCMR, and P values less than 0.05 were con- Discovery.V12) at the base of the sub apical yel-
GWAS variant for each locus and RSEM values sidered significant. low band where the switch from eumelanin to
greater than 0.5 in the primary melanocyte cultures. We characterized the function of MFSD12 in pheomelanin is visible.
Pearson correlation was used to measure the asso- vitro in immortalized melanocytes and in vivo in To characterize Mfsd12 in grizzled mice, Illu-
ciation between ancestry and gene expression. both zebrafish and mice. Immortalized melan- mina generated whole genome sequences of
For each locus, we tested whether any genes Ink4a melanocytes from C57BL/6 Ink4a-Arf1−/− grizzled, JIGR/DN (gr/gr) reads were mapped
with a transcription start site within 1 Mb of the mice were cultured as described (32). To deplete using bwa mem to GRCm38/mm10 (available at
top SNP had an eQTL amongst the set of pig- MFSD12, cells were infected with recombinant SRA Accession SRR5571237). Sequence variants
mentation QTLs using an additive linear model lentiviruses—generated by transient transfection between JIGR/DN gr/gr and C57BL/6J reference
with the first two principal components of ances- in HEK293T cells—to express Mfsd12-targeted genome within the gr/gr candidate region were
try as covariates. To identify significant variant- shRNAs or non-target controls. Cells resistant to identified using SnpEff (104). Validation of a 12-bp



gene associations we used a permutation approach puromycin (also encoded by the lentiviruses) were deletion within Mfsd12 was performed using sam-
for each locus independently (70). This was re- analyzed 5 to 7 days after infection. Knockdown ples from an independently maintained gr/gr
peated for each gene, and focal variants across all efficiency of Mfsd12 mRNA in cells expressing colony provided by the laboratory of Dr. Margit
genes were adjusted for multiple testing using Mfsd12-specific shRNAs relative to non target Burmeister.
the Bonferroni correction (101). shRNA was quantified by reverse transcription/
We also carried out allele-specific expression quantitative PCR (detected with SYBR Green;
REFERENCES AND NOTES
(ASE) analyses for each significant eQTL SNP. Sites Applied Biosystems) relative to Tubb4b (encoding
1. N. G. Jablonski, G. Chaplin, The evolution of human skin
with at least 30 mapped reads, <5% mapping bias, β-tubulin) as a reference gene. Melanin content coloration. J. Hum. Evol. 39, 57–106 (2000). doi: 10.1006/
and ENCODE 125 bp mappability score ≥1 were in cell lysates was determined by a spectropho- jhev.2000.0403; pmid: 10896812
retained for further analysis. For genes with a tometric assay as described (102), and melanin 2. M. S. Marks, M. C. Seabra, The melanosome: Membrane
heterozygous coding variant amongst the melano- coverage in intact cells was determined by bright dynamics in black and white. Nat. Rev. Mol. Cell Biol. 2,
738–748 (2001). doi: 10.1038/35096009; pmid: 11584301
cyte transcriptomes, allelic expression (AE) was field microscopy and analysis using the “Analyze 3. F. H. Moyer, Genetic variations in the fine structure and
computed as AE = |0.5 - NA/(NA+ NR)|, where Particles” plug-in in ImageJ (National Institutes ontogeny of mouse melanin granules. Am. Zool. 6, 43–66
NR is the number of reads carrying the reference of Health). To analyze MFSD12-HA localization, (1966). pmid: 5902512
allele and NA is the number of reads containing melan-Ink4a melanocytes were transiently trans- 4. J. P. Ebanks et al., Epidermal keratinocytes from light vs.
dark skin exhibit differential degradation of melanosomes.
the alternative allele. For each GWAS variant, dif- fected with MFSD12-HA expression plasmids J. Invest. Dermatol. 131, 1226–1233 (2011). doi: 10.1038/
ferences in gene AE between GWAS heterozygotes and analyzed 48 hours later by bright field and jid.2011.22; pmid: 21326292
and homozygotes was evaluated by Wilcoxon immunofluorescence microscopy as described 5. F. Liu et al., Genetics of skin color variation in Europeans:
rank-sum test. This was repeated for all possi- (103) using the TA99 monoclonal antibody to Genome-wide association studies with functional follow-up.
Hum. Genet. 134, 823–835 (2015). doi: 10.1007/s00439-015-
ble genes and GWAS variants and Bonferroni- TYRP1 (American Type Culture Collection) to 1559-0; pmid: 25963972
corrected P values less than 0.05 were considered detect melanosomes, rabbit anti-LAMP2A (Abcam) 6. S. Beleza et al., Genetic architecture of skin and eye color in
significant. For several genes, including HMG20B to detect lysosomes, and rat anti-HA (Roche) to an African-European admixed population. PLoS Genet. 9,
and DDB1, ASE could not be measured for some detect the transgene. Percent signal overlap in the e1003372 (2013). doi: 10.1371/journal.pgen.1003372;
pmid: 23555287
or all variants of interest because no individuals cell periphery was determined from background- 7. L. R. Lloyd-Jones et al., Inference on the genetic basis of eye
were heterozygous for both the test-variant and subtracted, thresholded, binary images using the and skin color in an admixed population via Bayesian linear
a coding variant. “Analyze Particles” plug-in in ImageJ. Statistical mixed models. Genetics 206, 1113–1126 (2017). doi: 10.1534/
For OCA2 we tested for an association between significance was determined using unpaired, two- genetics.116.193383; pmid: 28381588
8. S. A. Tishkoff et al., The genetic structure and history of
inclusion rates of exon 10, which contains our top tailed student’s t tests: P < 0.05: *, P < 0.01: **, P < Africans and African Americans. Science 324, 1035–1044
candidate causal variant in the region, rs1800404, 0.001: ***, P < 0.0001: ****. Details are provided (2009). doi: 10.1126/science.1172257; pmid: 19407144
and individual genotypes at rs1800404. For each in (70). 9. H. M. Kang et al., Variance component model to account for
melanocyte transcriptome, reads spanning the Zebrafish mutagenesis using CRISPR-Cas9 sample structure in genome-wide association studies. Nat.
Genet. 42, 348–354 (2010). doi: 10.1038/ng.548;
exon 9 to exon 10 and exon 9 to exon 11 junctions was performed to target mfsd12a (70). Compound pmid: 20208533
were extracted from the splice-junction files output heterozygous mutant fish for analysis were gen- 10. The 1000 Genomes Project Consortium, A global reference
by STAR. For each individual, a percent spliced erated from F1 incrosses of mutant founder fish. for human genetic variation. Nature 526, 68–74 (2015).
in (PSI) value was calculated. To estimate the effect For methylene blue staining, embryos were col- doi: 10.1038/nature15393; pmid: 26432245
11. F. Hormozdiari, E. Kostem, E. Y. Kang, B. Pasaniuc, E. Eskin,
of variation at rs1800404 on exon 10 inclusion, lected following fertilization and placed in zebra- Identifying causal variants at loci with multiple signals of
ordinary least squares regression was carried fish system water containing 0.0002% methylene association. Genetics 198, 497–508 (2014). doi: 10.1534/
out between PSI and dosage of the alternative blue and analyzed at 6 dpf. For GFP analysis, genetics.114.167908; pmid: 25104515
allele for rs1800404 across individuals. A two- embryos were injected with 25 pg Tg(aox5:PALM- 12. B. Vernot et al., Excavating Neandertal and Denisovan DNA
from the genomes of Melanesian individuals. Science 352,
sided t test was used to calculate a P value. GFP) and 80 pg tol2 mRNA and GFP expression 235–239 (2016). doi: 10.1126/science.aad9416; pmid: 26989198
To test the functional impact of a subset of was evaluated in mosaic injected fish at 5 dpf. 13. S. Mallick et al., The Simons Genome Diversity Project: 300
GWAS variants on gene expression, predicted Larvae were anesthetized in sub-lethal 1x tricane genomes from 142 diverse populations. Nature 538, 201–206
regulatory sequences containing variants were solution and placed in 100 ml of a low melt agarose (2016). doi: 10.1038/nature18964; pmid: 27654912
14. R. L. Lamason et al., SLC24A5, a putative cation exchanger,
cloned into a pGL4.23 firefly luciferase reporter solution (0.8%). affects pigmentation in zebrafish and humans. Science 310,
vector. Vectors were transfected into a WM88 In mice two targets for CRISPR-Cas9 cleavage 1782–1786 (2005). doi: 10.1126/science.1116238;
melanocytic melanoma cell line and co-transfected were selected within exon 2 of Mfsd12 to gen- pmid: 16357253
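The allelic expression (AE) statistic and the exon 10 inclusion analysis described in the methods text can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the PSI definition (inclusion junction reads divided by inclusion plus skipping junction reads) is an assumption, since the text only says that a PSI value was calculated from the exon 9 to exon 10 and exon 9 to exon 11 junction reads, and the toy numbers are invented. The Wilcoxon rank-sum test, the t test on the slope, and the Bonferroni correction are left to a statistics library and are not shown.

```python
from statistics import mean


def allelic_expression(n_alt, n_ref):
    """AE = |0.5 - N_A / (N_A + N_R)| for one heterozygous site."""
    total = n_alt + n_ref
    if total == 0:
        raise ValueError("no reads cover the variant")
    return abs(0.5 - n_alt / total)


def percent_spliced_in(inclusion_reads, skipping_reads):
    """Assumed PSI: exon 9-10 junction reads over all exon 9 junction reads."""
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no junction reads")
    return inclusion_reads / total


def ols_fit(dosage, psi):
    """Ordinary least squares fit of PSI on alternative-allele dosage.

    Returns (slope, intercept); the slope estimates the change in
    exon inclusion per copy of the alternative allele.
    """
    x_bar, y_bar = mean(dosage), mean(psi)
    sxx = sum((x - x_bar) ** 2 for x in dosage)
    if sxx == 0:
        raise ValueError("dosage has no variance")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosage, psi))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


if __name__ == "__main__":
    # Hypothetical read counts and genotypes, for illustration only.
    print(allelic_expression(n_alt=3, n_ref=1))
    psi_values = [percent_spliced_in(i, s) for i, s in [(20, 20), (30, 10), (36, 4)]]
    print(ols_fit([0, 1, 2], psi_values))
```

AE is 0 when the two alleles are equally represented and 0.5 when only one allele is seen, so larger values indicate stronger allelic imbalance regardless of direction.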


15. S. Beleza et al., The timing of pigmentation lightening in Europeans. Mol. Biol. Evol. 30, 24–35 (2013). doi: 10.1093/molbev/mss207; pmid: 22923467
16. M. Jonnalagadda et al., Identifying signatures of positive selection in pigmentation genes in two South Asian populations. Am. J. Hum. Biol. 29, e23012 (2017). doi: 10.1002/ajhb.23012; pmid: 28439965
17. C. Basu Mallick et al., The light skin allele of SLC24A5 in South Asians and Europeans shares identity by descent. PLOS Genet. 9, e1003912 (2013). doi: 10.1371/journal.pgen.1003912
18. I. Mathieson et al., Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015). doi: 10.1038/nature16152; pmid: 26595274
19. L. Pagani et al., Ethiopian genetic diversity reveals linguistic stratification and complex influences on the Ethiopian gene pool. Am. J. Hum. Genet. 91, 83–96 (2012). doi: 10.1016/j.ajhg.2012.05.015; pmid: 22726845
20. F. Tekola-Ayele et al., Novel genomic signals of recent selection in an Ethiopian population. Eur. J. Hum. Genet. 23, 1085–1092 (2015). doi: 10.1038/ejhg.2014.233; pmid: 25370040
21. C. M. Schlebusch et al., Genomic variation in seven Khoe-San groups reveals adaptation and complex African history. Science 338, 374–379 (2012). doi: 10.1126/science.1227721; pmid: 22997136
22. J. K. Pickrell et al., The genetic prehistory of southern Africa. Nat. Commun. 3, 1143 (2012). doi: 10.1038/ncomms2140; pmid: 23072811
23. L. Pagani et al., Tracing the route of modern humans out of Africa by using 225 human genome sequences from Ethiopians and Egyptians. Am. J. Hum. Genet. 96, 986–991 (2015). doi: 10.1016/j.ajhg.2015.04.019; pmid: 26027499
24. C. Ehret, An African Classical Age: Eastern and Southern Africa in World History, 1000 BC to AD 400 (Oxford, 1998).
25. M. G. Madej, S. Dang, N. Yan, H. R. Kaback, Evolutionary mix-and-match with MFS transporters. Proc. Natl. Acad. Sci. U.S.A. 110, 5870–5874 (2013). doi: 10.1073/pnas.1303538110; pmid: 23530251
26. R. Yu et al., Transcriptome analysis reveals markers of aberrantly activated innate immunity in vitiligo lesional and non-lesional skin. PLOS ONE 7, e51040 (2012). doi: 10.1371/journal.pone.0051040; pmid: 23251420
27. D. Richter et al., The age of the hominin fossils from Jebel Irhoud, Morocco, and the origins of the Middle Stone Age. Nature 546, 293–296 (2017). doi: 10.1038/nature22335; pmid: 28593967
28. M. Przeworski, The signature of positive selection at randomly chosen loci. Genetics 160, 1179–1189 (2002). pmid: 11901132
29. V. Le Corre, A. Kremer, The genetic differentiation at quantitative trait loci under local adaptation. Mol. Ecol. 21, 1548–1566 (2012). doi: 10.1111/j.1365-294X.2012.05479.x; pmid: 22332667
30. Roadmap Epigenomics Consortium, Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015). doi: 10.1038/nature14248; pmid: 25693563
31. D. Hnisz et al., Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013). doi: 10.1016/j.cell.2013.09.053; pmid: 24119843
32. E. V. Sviderskaya et al., p16Ink4a in melanocyte senescence and differentiation. J. Natl. Cancer Inst. 94, 446–454 (2002). doi: 10.1093/jnci/94.6.446; pmid: 11904317
33. K. Howe et al., The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498–503 (2013). doi: 10.1038/nature12111; pmid: 23594743
34. R. N. Kelsh et al., Zebrafish pigmentation mutations and the processes of neural crest development. Development 123, 369–389 (1996). pmid: 9007256
35. S. Le Guyader, S. Jesuthasan, Analysis of xanthophore and pterinosome biogenesis in zebrafish using methylene blue and pteridine autofluorescence. Pigment Cell Res. 15, 27–31 (2002). doi: 10.1034/j.1600-0749.2002.00045.x; pmid: 11837453
36. J. L. Bloom, D. S. Falconer, ‘Grizzled’, a mutant in linkage group X of the mouse. Genet. Res. 7, 159–167 (1966). doi: 10.1017/S0016672300009587
37. T. Kobayashi et al., Modulation of melanogenic protein expression during the switch from eu- to pheomelanogenesis. J. Cell Sci. 108, 2301–2309 (1995). pmid: 7673350
38. G. Raposo, D. Tenza, D. M. Murphy, J. F. Berson, M. S. Marks, Distinct protein sorting and localization to premelanosomes, melanosomes, and lysosomes in pigmented melanocytic cells. J. Cell Biol. 152, 809–824 (2001). doi: 10.1083/jcb.152.4.809; pmid: 11266471
39. G. Chu, E. Chang, Xeroderma pigmentosum group E cells lack a nuclear factor that binds to damaged DNA. Science 242, 564–567 (1988). doi: 10.1126/science.3175673; pmid: 3175673
40. A. L. Kadekaro et al., Melanocortin 1 receptor genotype: An important determinant of the damage response of melanocytes to ultraviolet radiation. FASEB J. 24, 3850–3860 (2010). doi: 10.1096/fj.10-158485; pmid: 20519635
41. Y. Zhang et al., Arabidopsis DDB1-CUL4 ASSOCIATED FACTOR1 forms a nuclear E3 ubiquitin ligase with DDB1 and CUL4 that is involved in multiple plant developmental processes. Plant Cell 20, 1437–1455 (2008). doi: 10.1105/tpc.108.058891; pmid: 18552200
42. C. Yu et al., CRL4 complex regulates mammalian oocyte survival and reprogramming by activation of TET proteins. Science 342, 1518–1521 (2013). doi: 10.1126/science.1244587; pmid: 24357321
43. M. Lieberman, O. Segev, N. Gilboa, A. Lalazar, I. Levin, The tomato homolog of the gene encoding UV-damaged DNA binding protein 1 (DDB1) underlined as the gene that causes the high pigment-1 mutant phenotype. Theor. Appl. Genet. 108, 1574–1581 (2004). pmid: 14968305
44. K.-i. Takata, H. Yoshida, M. Yamaguchi, K. Sakaguchi, Drosophila damaged DNA-binding protein 1 is an essential factor for development. Genetics 168, 855–865 (2004). doi: 10.1534/genetics.103.025965; pmid: 15514059
45. J. H. Lee et al., Evolutionarily assembled cis-regulatory module at a human ciliopathy locus. Science 335, 966–969 (2012). doi: 10.1126/science.1213506; pmid: 22282472
46. G. Li et al., Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012). doi: 10.1016/j.cell.2011.12.014; pmid: 22265404
47. L. Teng, B. He, J. Wang, K. Tan, 4DGenome: A comprehensive database of chromatin interactions. Bioinformatics 31, 2560–2564 (2015). doi: 10.1093/bioinformatics/btv158; pmid: 25788621
48. P. Laurette et al., Transcription factor MITF and remodeller BRG1 define chromatin organisation at regulatory elements in melanoma cells. eLife 4, e06857 (2015). doi: 10.7554/eLife.06857; pmid: 25803486
49. M. H. Beltrame, M. A. Rubel, S. A. Tishkoff, Inferences of African evolutionary history from genomic data. Curr. Opin. Genet. Dev. 41, 159–166 (2016). doi: 10.1016/j.gde.2016.10.002; pmid: 27810637
50. M. Visser, M. Kayser, R.-J. Palstra, HERC2 rs12913832 modulates human pigmentation by attenuating chromatin-loop formation between a long-range enhancer and the OCA2 promoter. Genome Res. 22, 446–455 (2012). doi: 10.1101/gr.128652.111; pmid: 22234890
51. M. Kayser et al., Three genome-wide association studies and a linkage analysis identify HERC2 as a human iris color gene. Am. J. Hum. Genet. 82, 411–423 (2008). doi: 10.1016/j.ajhg.2007.10.003; pmid: 18252221
52. J. Han et al., A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation. PLOS Genet. 4, e1000074 (2008). doi: 10.1371/journal.pgen.1000074; pmid: 18483556
53. N. W. Bellono, I. E. Escobar, A. J. Lefkovith, M. S. Marks, E. Oancea, An intracellular anion channel critical for pigmentation. eLife 3, e04543 (2014). doi: 10.7554/eLife.04543; pmid: 25513726
54. M. H. Brilliant, The mouse p (pink-eyed dilution) and human P genes, oculocutaneous albinism type 2 (OCA2), and melanosomal pH. Pigment Cell Res. 14, 86–93 (2001). doi: 10.1034/j.1600-0749.2001.140203.x; pmid: 11310796
55. N. Eriksson et al., Web-based, participant-driven studies yield novel genetic associations for common traits. PLOS Genet. 6, e1000993 (2010). doi: 10.1371/journal.pgen.1000993; pmid: 20585627
56. H. L. Norton et al., Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Mol. Biol. Evol. 24, 710–722 (2007). doi: 10.1093/molbev/msl203; pmid: 17182896
57. A. Rafati et al., Association of rs12913832 in the HERC2 gene affecting human iris color variation. ASJ 12, 9–16 (2015).
58. H. Eiberg et al., Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression. Hum. Genet. 123, 177–187 (2008). doi: 10.1007/s00439-007-0460-x; pmid: 18172690
59. H. E. Seberg et al., TFAP2 paralogs regulate melanocyte differentiation in parallel with MITF. PLOS Genet. 13, e1006636 (2017). doi: 10.1371/journal.pgen.1006636; pmid: 28249010
60. W. S. Oetting, S. S. Garrett, M. Brott, R. A. King, P gene mutations associated with oculocutaneous albinism type II (OCA2). Hum. Mutat. 25, 323 (2005). doi: 10.1002/humu.9318; pmid: 15712365
61. R. Kerr et al., Identification of P gene mutations in individuals with oculocutaneous albinism in sub-Saharan Africa. Hum. Mutat. 15, 166–172 (2000). doi: 10.1002/(SICI)1098-1004(200002)15:2<166::AID-HUMU5>3.0.CO;2-Z; pmid: 10649493
62. A. R. Wood et al., Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014). doi: 10.1038/ng.3097; pmid: 25282103
63. R. M. Harding et al., Evidence for variable selective pressures at MC1R. Am. J. Hum. Genet. 66, 1351–1361 (2000). doi: 10.1086/302863; pmid: 10733465
64. D. Reich et al., Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010). doi: 10.1038/nature09710; pmid: 21179161
65. N. G. Jablonski, G. Chaplin, The colours of humanity: The evolution of pigmentation in the human lineage. Philos. Trans. R. Soc. London B Biol. Sci. 372, 20160349 (2017). doi: 10.1098/rstb.2016.0349; pmid: 28533464
66. A. R. Rogers, D. Iltis, S. Wooding, Genetic variation at the MC1R locus and the time since loss of human body hair. Curr. Anthropol. 45, 105–108 (2004). doi: 10.1086/381006
67. G. H. Perry, N. J. Dominy, Evolution of the human pygmy phenotype. Trends Ecol. Evol. 24, 218–225 (2009). doi: 10.1016/j.tree.2008.11.008; pmid: 19246118
68. L. Pagani et al., Genomic analyses inform on migration events during the peopling of Eurasia. Nature 538, 238–242 (2016). doi: 10.1038/nature19792; pmid: 27654910
69. A.-S. Malaspinas et al., A genomic history of Aboriginal Australia. Nature 538, 207–214 (2016). doi: 10.1038/nature18299; pmid: 27654914
70. Materials and methods are available as supplementary materials.
71. J. K. Wagner, C. Jovel, H. L. Norton, E. J. Parra, M. D. Shriver, Comparing quantitative measures of erythema, pigmentation and skin response using reflectometry. Pigment Cell Res. 15, 379–384 (2002). doi: 10.1034/j.1600-0749.2002.02042.x; pmid: 12213095
72. S. R. Browning, B. L. Browning, Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007). doi: 10.1086/521987; pmid: 17924348
73. A. McKenna et al., The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010). doi: 10.1101/gr.107524.110; pmid: 20644199
74. H. Li, Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics 30, 2843–2851 (2014). doi: 10.1093/bioinformatics/btu356; pmid: 24974202
75. T. Rausch et al., DELLY: Structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339 (2012). doi: 10.1093/bioinformatics/bts378; pmid: 22962449
76. J. O’Connell et al., A general approach for haplotype phasing across the full spectrum of relatedness. PLOS Genet. 10, e1004234 (2014). doi: 10.1371/journal.pgen.1004234; pmid: 24743097
77. O. Delaneau, J. Marchini, J.-F. Zagury, A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2011). doi: 10.1038/nmeth.1785; pmid: 22138821
78. S. Das et al., Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016). doi: 10.1038/ng.3656; pmid: 27571263
79. C. Y. McLean et al., GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010). doi: 10.1038/nbt.1630; pmid: 20436461
80. J. Yang, S. H. Lee, M. E. Goddard, P. M. Visscher, GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011). doi: 10.1016/j.ajhg.2010.11.011; pmid: 21167468


81. J. Yang et al., Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010). doi: 10.1038/ng.608; pmid: 20562875
82. B. S. Weir, C. C. Cockerham, Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984). doi: 10.2307/2408641; pmid: 28563791
83. B. F. Voight, S. Kudaravalli, X. Wen, J. K. Pritchard, A map of recent positive selection in the human genome. PLOS Biol. 4, e72 (2006). doi: 10.1371/journal.pbio.0040072; pmid: 16494531
84. F. Tajima, Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595 (1989). pmid: 2513255
85. P. Danecek et al., The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011). doi: 10.1093/bioinformatics/btr330; pmid: 21653522
86. Z. A. Szpiech, R. D. Hernandez, selscan: An efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014). doi: 10.1093/molbev/msu211; pmid: 25015648
87. H. J. Bandelt, P. Forster, A. Röhl, Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 16, 37–48 (1999). doi: 10.1093/oxfordjournals.molbev.a026036; pmid: 10331250
88. Fluxus Engineering, www.fluxus-engineering.com.
89. R. B. Lyngsø, Y. S. Song, J. Hein, Minimum recombination histories by branch and bound, in Algorithms in Bioinformatics, R. Casadio, G. Myers, Eds. (Lecture Notes in Computer Science Series, Springer, 2005), vol. 3692, pp. 239–250.
90. J. Terhorst, J. A. Kamm, Y. S. Song, Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017). doi: 10.1038/ng.3748; pmid: 28024154
91. J. Ernst, M. Kellis, ChromHMM: Automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012). doi: 10.1038/nmeth.1906; pmid: 22373907
92. L. H. Chadwick, The NIH Roadmap Epigenomics Program data resource. Epigenomics 4, 317–324 (2012). doi: 10.2217/epi.12.18; pmid: 22690667
93. A. R. Quinlan, I. M. Hall, BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010). doi: 10.1093/bioinformatics/btq033; pmid: 20110278
94. J. Zhou, O. G. Troyanskaya, Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015). doi: 10.1038/nmeth.3547; pmid: 26301843
95. D. Lee et al., A method to predict the impact of regulatory variants from DNA sequence. Nat. Genet. 47, 955–961 (2015). doi: 10.1038/ng.3331; pmid: 26075791
96. T. L. Bailey et al., MEME SUITE: Tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009). doi: 10.1093/nar/gkp335; pmid: 19458158
97. A. Mathelier et al., JASPAR 2016: A major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 44, D110–D115 (2016). doi: 10.1093/nar/gkv1176; pmid: 26531826
98. A. Dobin et al., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013). doi: 10.1093/bioinformatics/bts635; pmid: 23104886
99. B. Li, C. N. Dewey, RSEM: Accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011). doi: 10.1186/1471-2105-12-323; pmid: 21816040
100. O. Stegle, L. Parts, R. Durbin, J. Winn, A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLOS Comput. Biol. 6, e1000770 (2010). doi: 10.1371/journal.pcbi.1000770; pmid: 20463871
101. C. E. Bonferroni, Teoria statistica delle classi e calcolo delle probabilità (Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 1936), vol. 8.
102. C. Delevoye et al., AP-1 and KIF13A coordinate endosomal sorting and positioning during melanosome biogenesis. J. Cell Biol. 187, 247–264 (2009). doi: 10.1083/jcb.200907122; pmid: 19841138
103. P. A. Calvo, D. W. Frank, B. M. Bieler, J. F. Berson, M. S. Marks, A cytoplasmic sequence in human tyrosinase defines a second class of di-leucine-based sorting signals for late endosomal and lysosomal delivery. J. Biol. Chem. 274, 12780–12789 (1999). doi: 10.1074/jbc.274.18.12780; pmid: 19505943
104. P. Cingolani et al., A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly (Austin) 6, 80–92 (2012). doi: 10.4161/fly.19695; pmid: 25722852
105. A. Ruiz-Linares et al., Admixture in Latin America: Geographic structure, phenotypic diversity and self-perception of ancestry based on 7,342 individuals. PLOS Genet. 10, e1004572 (2014). doi: 10.1371/journal.pgen.1004572; pmid: 21963610

ACKNOWLEDGMENTS
We thank J. Akey and R. McCoy for Melanesian genotype data; A. Clark, C. Brown, and Y. S. Park for critical review of the manuscript; members of the Tishkoff laboratory for helpful discussion; A. Weeraratna at Wistar Institute, Philadelphia, for providing the WM88 melanocytic cell line; D. Parichy for the aox5:palmGFP plasmid; G. Xu and R. Yang at University of Pennsylvania for technical assistance; M. Burmeister at University of Michigan for grizzled mouse samples; L. Garrett at the Embryonic Stem Cell and Transgenic Mouse Core [National Human Genome Research Institute (NHGRI)]; R. Sood in the Zebrafish Core (NHGRI); and the African participants. We acknowledge the contribution of the staff members of the Cancer Genomics Research Laboratory [National Cancer Institute (NCI)], the NIH Intramural Sequencing Center, the NCI Center for Cancer Research Sequencing Facility, the Yale University Skin SPORE Specimen Resource Core, and the Botswana–University of Pennsylvania Partnership. This work used computational resources of the NIH High-Performance Computing (HPC) Biowulf cluster. This research was funded by the following grants: NIH grants 1R01DK104339-0 and 1R01GM113657-01 and NSF grant BCS-1317217 to S.A.T., NIH grant R01 AR048155 from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) to M.S.M., NIH grant R01 AR066318 from NIAMS to E.O., NIH grants 5R24OD017870-04 and 1U54DK110805-01 to L.Z. and Y.Z., NIH grant R01-GM094402 to Y.S.S., and NIH grant K12 GM081259 from NIGMS to S.B. M.H.B. was partly supported by a “Science Without Borders” fellowship from CNPq, Brazil. Y.S.S. is a Chan Zuckerberg Biohub investigator. This work was supported in part by the Center of Excellence in Environmental Toxicology (NIH P30-ES013508, T32-ES019851 to M.E.B.H.) (National Institute of Environmental Health Sciences), the Intramural Program of the NHGRI, and the Division of Cancer Epidemiology and Genetics, NCI, NIH, federal funds from NCI under contract HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does the mention of trade names, commercial products, or organizations imply endorsement by the U.S. government. L.Z. is a founder and stockholder of Fate Therapeutics, Marauder Therapeutics, and Scholar Rock. Data are available at dbGAP accession number phs001396.v1.p1 and SRA BioProject PRJNA392485. In memory of G. Lema and E. Kimaro, who made important contributions to this project.

SUPPLEMENTARY MATERIALS
www.sciencemag.org/content/358/6365/eaan8433/suppl/DC1
Materials and Methods
Figs. S1 to S21
Tables S1 to S8
NISC Comparative Sequencing Program Collaborator List
References (106–133)

27 May 2017; accepted 3 October 2017
Published online 12 October 2017
10.1126/science.aan8433



Loci associated with skin pigmentation identified in African populations
Nicholas G. Crawford, Derek E. Kelly, Matthew E. B. Hansen, Marcia H. Beltrame, Shaohua Fan, Shanna L. Bowman, Ethan
Jewett, Alessia Ranciaro, Simon Thompson, Yancy Lo, Susanne P. Pfeifer, Jeffrey D. Jensen, Michael C. Campbell, William
Beggs, Farhad Hormozdiari, Sununguko Wata Mpoloka, Gaonyadiwe George Mokone, Thomas Nyambo, Dawit Wolde
Meskel, Gurja Belay, Jake Haut, NISC Comparative Sequencing Program, Harriet Rothschild, Leonard Zon, Yi Zhou, Michael
A. Kovacs, Mai Xu, Tongwu Zhang, Kevin Bishop, Jason Sinclair, Cecilia Rivas, Eugene Elliot, Jiyeon Choi, Shengchao A. Li,
Belynda Hicks, Shawn Burgess, Christian Abnet, Dawn E. Watkins-Chow, Elena Oceana, Yun S. Song, Eleazar Eskin, Kevin
M. Brown, Michael S. Marks, Stacie K. Loftus, William J. Pavan, Meredith Yeager, Stephen Chanock and Sarah A. Tishkoff

Science 358 (6365), eaan8433.


DOI: 10.1126/science.aan8433, originally published online October 12, 2017



African genomics and skin color
Skin color varies among human populations and is thought to be under selection, with light skin maximizing
vitamin D production at higher latitudes and dark skin providing UV protection in equatorial zones. To identify the genes
that give rise to the palette of human skin tones, Crawford et al. applied genome-wide analyses across diverse African
populations (see the Perspective by Tang and Barsh). Genetic variants were identified with likely function in skin
phenotypes. Comparison to model organisms verified a conserved function of MFSD12 in pigmentation. A global genetic
panel was used to trace how alleles associated with skin color likely moved across the globe as humans migrated, both
within and out of Africa.
Science, this issue p. eaan8433; see also p. 867

ARTICLE TOOLS http://science.sciencemag.org/content/358/6365/eaan8433

SUPPLEMENTARY MATERIALS http://science.sciencemag.org/content/suppl/2017/10/11/science.aan8433.DC1

RELATED CONTENT
http://science.sciencemag.org/content/sci/358/6360/157.full
http://science.sciencemag.org/content/sci/358/6365/867.full
http://science.sciencemag.org/content/sci/358/6369/eaar7002.full
REFERENCES This article cites 127 articles, 26 of which you can access for free
http://science.sciencemag.org/content/358/6365/eaan8433#BIBL

PERMISSIONS http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of
Science, 1200 New York Avenue NW, Washington, DC 20005. 2017 © The Authors, some rights reserved; exclusive
licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. The title
Science is a registered trademark of AAAS.
