Segmental Duplications: A Source of Diversity, Evolution and Disease

Advanced article

Claudia R Catacchio, University of Bari Aldo Moro, Bari, Italy
Giorgia Chiatante, University of Bari Aldo Moro, Bari, Italy
Fabio Anaclerio, University of Bari Aldo Moro, Bari, Italy
Mario Ventura, University of Bari Aldo Moro, Bari, Italy

Online posting date: 27th January 2015

Article Contents
• Introduction
• Segmental Duplications and Copy Number Variations
• Impact of Segmental Duplications on the Human Genome
• Segmental Duplications and Evolution
• Conclusions
• Acknowledgments

eLS subject area: Evolution & Diversity of Life

How to cite:
Catacchio, Claudia R; Chiatante, Giorgia; Anaclerio, Fabio; and Ventura, Mario (January 2015) Segmental Duplications: A Source of Diversity, Evolution and Disease. In: eLS. John Wiley & Sons, Ltd: Chichester.
DOI: 10.1002/9780470015902.a0020838.pub2

eLS © 2015, John Wiley & Sons, Ltd. www.els.net

Genome mutations represent a source of variability on which selective pressure acts: negative changes are purged from the populations, whereas positive and neutral changes may be fixed. Segmental duplications (SDs) in the human genome trigger mutations such as structural rearrangements (duplications, deletions, inversions and translocations), thus playing a crucial role in human disease and genome evolution. Several human diseases (genomic disorders) are caused by non-allelic homologous recombination (NAHR) between highly similar SDs, and gene-containing SDs were crucial to the survival and adaptation of the human species during evolution. Moreover, from both a pathological and an evolutionary point of view, SDs represent critically important regions for acentric fragment rescue during the neocentromerization process. In this light, disease and evolution can be considered as ‘two sides of the same coin’, where the coin represents the SD-mediated chromosomal rearrangements.

Introduction

Genome duplication (GD) during evolution has played a critical role in increasing genome size and complexity. Since the discovery of GD 40 years ago, our knowledge and understanding of the evolution of genes and genomes has hugely increased. Duplication events can involve the full genome (Ohno et al., 1968), or narrow regions can be expanded, thus increasing the copy number of the genes embedded in them. In the latter case, the involved sequences are generally named SDs.

SDs represent a ‘new’ genomic repetitive element class, commonly defined as highly identical duplicated DNA fragments greater than 1 kb and distinguished from any other classic repetitive element, such as LINEs, by the absence of any recurrent element in them. Most of them show interspersed rather than tandem organization. SDs in humans do not cluster but are rather enriched in regions such as pericentromeres and subtelomeres and on a few chromosomes, such as 7, 15, 16, 17 and 22, thus creating ‘hot spots’ of NAHR events (Girirajan et al., 2010).

Since the discovery of SDs, a lot of data have been produced on their involvement in genomic disorders and evolution. In particular, regions of, or enriched in, SDs, because of their high identity and wide distribution in the genomes, are favourite sites of copy number polymorphisms, genomic disorders and evolutionary breakpoints. Rearrangements may create new genes with different spatial and temporal expression, thus creating and differentiating gene pools in the cells or organisms. In this light, SDs are great substrates for genomic variability and represent exceptional examples of the ‘two sides of the same coin’ effect: disease versus evolution. In this article, we focus on how SDs in human, primate and mammalian genomes have played an important role in genome plasticity (See also: Segmental Duplications and Their Role in the Evolution of the Human Genome).

Segmental Duplications and Copy Number Variations

The high similarity shared between SDs creates ideal conditions for NAHR leading to chromosomal rearrangements (CRs). The outcome and impact of these NAHR events mainly depend on the size, percentage of identity and orientation of the paralogous SDs and the recombination activity of the involved region. Indeed, structural polymorphisms and recurrent pathogenic human rearrangements are mostly caused by unequal crossover of paired, ≥10 kb, ≥95% identical, 50 kb–10 Mb distant SDs (Stankiewicz, 2010).

Recently, duplicated genomic regions, roughly 12% of the human genome, have been reported as highly polymorphic in copy number in the normal population [copy number variants (CNVs)], thus increasing the complexity of the human genome (Zhou and Mishra, 2005). CNVs have been considered as the unsettled, polymorphic form of SDs, and SDs correspond to CNVs that have been fixed in the human population (see Figure 1) (Kim et al., 2008). However, at the moment, complete repertoires distinguishing between SDs and CNVs do not exist, and some duplications annotated as SDs may not be fixed in the population, being rather common CNVs. For this reason, current efforts to sequence individual human genomes, such as the 1000 Genomes Project, are focused on bringing greater certainty about which SDs are fixed and which are polymorphic, revealing significant differences in the prevalence of genomic disorders among different human populations (Mulle et al., 2010) (See also: Segmental Duplications and Genetic Disease).

Figure 1 Model representing the formation of copy number variants (CNVs, striped rectangles) and segmental duplications (SDs, solid colour rectangles). Regions harbouring highly identical SDs (green/blue rectangles) undergo NAHR. If not negatively selected, the created microdeletions and microduplications (rectangles with star) represent new CNVs and can be transmitted from generation to generation. The frequency of the duplicated sequence in the population may increase until it becomes fixed, creating a new SD (turquoise rectangle).

Impact of Segmental Duplications on the Human Genome

SDs can mediate unbalanced (microdeletions and microduplications) and balanced (translocations and inversions) genomic rearrangements that in turn can increase the susceptibility to human disease or, at worst, directly engender a clinical phenotype (Wain et al., 2009).

The role of SDs in the susceptibility to infectious and inflammatory diseases has been demonstrated, as a result of the fact that regions of SDs are enriched in genes involved in immune response. One of the first studies to provide precedents for a link between SD distribution and variability in the phenotypic response to disease showed that variation in the copy number of CCL3L1 (a potent human immunodeficiency virus-1 (HIV-1)-suppressive chemokine and ligand for the HIV co-receptor CCR5) is significantly associated with considerably enhanced HIV/acquired immunodeficiency syndrome (AIDS) susceptibility. In particular, evidence of lower CCL3L1 copy number among HIV-positive individuals compared with HIV-negative subjects suggested that people who carry a CCL3L1 copy number lower than the population average have a higher risk of HIV infection (Gonzalez et al., 2005).

Recent studies have provided evidence for the involvement of specific CNVs also in cancer susceptibility. A common CNV at chromosome 1q21.1 contributes to neuroblastoma susceptibility by altering the expression of the NBPF23 gene (Diskin et al., 2009). Similarly, Marotta et al. (2012) hypothesized that a large (300 kb), complex block of duplicated segments (sequence similarity >90%) was involved in the initiating mechanisms of ERBB2 amplification, found in breast cancers and associated with advanced stages, recurrence and poor patient survival. This suggests that deletion polymorphisms of such repeated sequences might affect the initiation and thus the frequency of ERBB2 amplification.

Microdeletions and microduplications

There are numerous examples of SD-mediated polymorphic CNV formation, but when the region changing copy number contains dosage-sensitive or imprinted genes, CNVs are responsible for the genomic alterations underlying genetic syndromes (genomic disorders) (see Table 1).

As deletions and duplications caused by NAHR are reciprocal events, the assumption that their rates were similar has been a common misconception for a long time. Nevertheless, it has recently been shown that, at least in male meiosis, deletions occur approximately twice as frequently as duplications on autosomes (Turner et al., 2008). Up to 2012, 211 microdeletion versus 79 microduplication syndromes had been reported. Only for 56 of these loci (21%) are reciprocal/co-localising microdeletion and microduplication syndromes (MMSs) described (Weise et al., 2012). Moreover, it is a general observation that microduplications result in a milder or no clinical phenotype compared to the reciprocal microdeletions, as shown by duplications versus deletions of 22q11.2 (Ensenauer et al., 2003) and 1q21.1 (Brunetti-Pierri et al., 2008).
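The size, identity and spacing thresholds quoted in the section on SDs and copy number variations (≥10 kb, ≥95% identical, 50 kb–10 Mb apart; Stankiewicz, 2010) can be read as a simple filter on pairs of paralogous SDs. The Python sketch below is only an illustration under stated assumptions: the `SDPair` record, its field names and both function names are hypothetical, and mapping direct orientation to deletion/duplication and inverted orientation to inversion is a deliberate simplification of what the article describes for intrachromosomal NAHR.

```python
from dataclasses import dataclass

# Thresholds quoted in the text (Stankiewicz, 2010).
MIN_LENGTH_BP = 10_000          # SD length of at least 10 kb
MIN_IDENTITY = 0.95             # sequence identity of at least 95%
MIN_DISTANCE_BP = 50_000        # copies at least 50 kb apart
MAX_DISTANCE_BP = 10_000_000    # copies at most 10 Mb apart


@dataclass(frozen=True)
class SDPair:
    """Hypothetical record for two paralogous SD copies on one chromosome."""
    length_bp: int            # length of the duplicated segment
    identity: float           # fractional sequence identity (0 to 1)
    distance_bp: int          # distance separating the two copies
    direct_orientation: bool  # True if both copies point the same way


def is_nahr_prone(pair: SDPair) -> bool:
    """Return True if the pair falls inside the quoted size/identity/distance range."""
    return (
        pair.length_bp >= MIN_LENGTH_BP
        and pair.identity >= MIN_IDENTITY
        and MIN_DISTANCE_BP <= pair.distance_bp <= MAX_DISTANCE_BP
    )


def predicted_rearrangement(pair: SDPair) -> str:
    """Simplified outcome: direct repeats give copy number change, inverted repeats give inversion."""
    if not is_nahr_prone(pair):
        return "outside the quoted NAHR-prone range"
    if pair.direct_orientation:
        return "microdeletion or microduplication"
    return "inversion"
```

A pair of 20 kb copies at 98% identity lying 1.5 Mb apart passes the filter, whereas a 5 kb pair does not, whatever its identity. Real predictions would also need the local recombination activity, which the text lists as a further determinant and which this sketch ignores.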


Table 1 Recurrent SD-mediated microdeletion/duplications

Locus | Microdeletion-associated syndromes | MIM code | Locus | Microduplication-associated syndromes | MIM code
1q21.1 | Chromosome 1q21.1 deletion syndrome, 1.35-Mb | 612474 | 1q21.1 | Chromosome 1q21.1 duplication syndrome | 612475
2q21.1 | Attention deficit hyperactivity disorder, susceptibility to | 612311 | 2q21.1 | NI | –
 | Attention deficit hyperactivity disorder (ADHD) | 143465 | | NI | –
3q29 | 3q29 deletion syndrome | 609425 | 3q29 | 3q29 microduplication syndrome | 611936
7q11.23 | Williams-Beuren syndrome (WBS) | 194050 | 7q11.23 | Williams-Beuren region duplication syndrome | 609757
10q23.3 | Prostate cancer | 176807 | 10q23.3 | NI | –
15q11-q13 | Angelman syndrome (if maternally derived) | 105830 | 15q11-q13 | 15q11-q13 duplication syndrome | 608636
 | Prader-Willi syndrome (if paternally derived) | 176270 | | |
15q13.3 | 15q13.3 deletion syndrome | 612001 | 15q13.3 | NI | –
 | Schizophrenia 13; sczd13 | 613025 | 15q13.3 | NI | –
15q24 | 15q24 deletion syndrome | 613406 | 15q24 | NI | –
16p11.2 | Chromosome 16p11.2 deletion syndrome, 593 kb | 611913 | 16p11.2 | Chromosome 16p11.2 duplication syndrome | 614671
16p11.2-p12.2 | 16p11.2p12.2 deletion syndrome | 613604 | 16p11.2p12.2 | NI | –
16p12.1 | Chromosome 16p12.1 deletion syndrome, 520-kb | 136570 | 16p12.1 | NI | –
17p11.2 | Smith-Magenis syndrome (SMS) (a) | 182290 | 17p11.2 | Potocki-Lupski syndrome (PTLS) | 610883
17p12 | Hereditary neuropathy with liability to pressure palsies syndrome (HNPP) | 162500 | 17p12 | Charcot–Marie–Tooth type 1A (CMT1A) | 604563
17q11.2 | Neurofibromatosis type 1 microdeletion syndrome (NF1) | 613675 | 17q11.2 | NI | –
17q12 | Chromosome 17q12 deletion syndrome | 614527 | 17q12 | Chromosome 17q12 duplication syndrome | 614526
17q21.31 | Koolen-De Vries syndrome | 610443 | 17q21.31 | 17q21.31 duplication syndrome | 613533
22q11.2 | DiGeorge syndrome (DGS) | 188400 | 22q11.2 | 22q11.2 duplication syndrome | 608363
 | Velocardiofacial syndrome (VCFS) | 192430 | | |
 | Chromosome 22q11.2 deletion syndrome, distal | 611867 | | |
Xp11.22-p11.23 | NI | – | Xp11.23-p11.22 | Chromosome Xp11.23-p11.22 duplication syndrome | 300801
Xp22 | X-linked ichthyosis | 308100 | Xp22 | NI | –

NI, not identified.
(a) The first evidence of a genomic disorder in a chimpanzee, with features resembling Smith-Magenis syndrome (see text for details).
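As a small illustration of how Table 1 can be queried for reciprocal microdeletion/microduplication syndromes (MMSs), the Python sketch below encodes a handful of its rows and lists the loci for which both a deletion and a duplication syndrome are reported. The dictionary layout and the function name are assumptions made for this example; only the loci and MIM codes are taken from the table, and the subset is not meant to be complete.

```python
# A subset of Table 1: locus -> (deletion-syndrome MIM codes, duplication-syndrome MIM codes).
# An empty list stands for "NI" (not identified) in the table.
TABLE1_SUBSET = {
    "1q21.1": (["612474"], ["612475"]),
    "10q23.3": (["176807"], []),
    "17p11.2": (["182290"], ["610883"]),
    "22q11.2": (["188400", "192430", "611867"], ["608363"]),
    "Xp22": (["308100"], []),
}


def reciprocal_loci(table):
    """Return the loci that have both deletion and duplication syndromes listed."""
    return sorted(
        locus
        for locus, (deletion_mims, duplication_mims) in table.items()
        if deletion_mims and duplication_mims
    )


if __name__ == "__main__":
    for locus in reciprocal_loci(TABLE1_SUBSET):
        print(locus)
```

In this subset, 1q21.1, 17p11.2 and 22q11.2 are reported as reciprocal, while 10q23.3 and Xp22 have only a deletion-associated entry.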

Recent studies have consistently coupled SD-mediated NAHR causing genomic structural rearrangements to complex phenotypes associated with mental illnesses, such as autism spectrum disorders (ASD), schizophrenia, mental retardation and intellectual disability (Beckmann et al., 2007) (see Table 1).

The genotype/phenotype correlation and the variability in penetrance and expression regarding the genetics of mental illnesses mainly depend on the role of the genes contained in the deleted/duplicated interval. However, the parental origin of the CNV, as well as the presence of variations in the wild-type allele, has also been shown to play an important role. Preliminary evidence has also been provided on the contribution of genes or expressed pseudogenes contained in the SDs (besides those located in the single region between the SD blocks) to the clinical phenotype (Merla et al., 2010).


The knowledge of the basic features of the SDs mediating NAHR events that cause genomic disorders allowed the use of an inverse strategy to detect genomic regions involved in the aetiology of mental retardation: a map of potential ‘rearrangement hotspots’ in the human genome has been generated, highlighting regions of genomic instability. On the basis of this map, several genomic abnormalities on 17q21.31, 1q21.1, 15q13.1–13.3 and 15q24.1–q24.3 were detected using a customized array (Sharp et al., 2006).

Inversions and translocations

Inversions generated by meiotic or mitotic intrachromatidic misalignment between inverted homologous SDs can be considered a benign polymorphism because carriers are phenotypically normal (Tam et al., 2008). However, polymorphic chromosomal microinversions have been found as putative susceptibility or resistance variants for some of the previously mentioned MMSs. Indeed, these structural events can predispose to or protect from further genomic rearrangements, enhancing or lowering the disease risk in carriers. Two extended haplotypes, designated H1 and H2, have been identified in the 17q21.31 hotspot region, with H2 being inverted compared to the reference genome. The inverted haplotype is rare in Africans and almost absent in East Asians but is found at a frequency of 20% in Europeans. The 900 kb inversion haplotype is the only one that carries SDs in direct orientation at both breakpoints of the 17q21.31 microdeletion region and therefore has a tendency to undergo NAHR leading to the 17q21.31 microdeletion syndrome (MIM 610443) (Itsara et al., 2012). These results bear striking similarity to another region of the human genome: heterozygosity for a polymorphic ∼2 Mb chromosomal microinversion in 7q11.23 is thought to lead to abnormal meiotic pairing and therefore an increased susceptibility to unequal recombination causing the Williams-Beuren syndrome deletion. Similarly, paracentric microinversions have been found as putative susceptibility variants for other recurrent genomic rearrangements, such as the deletions causing Angelman and Sotos syndromes (Cusco et al., 2008).

NAHR between interchromosomal SDs (i.e. on non-homologous chromosomes) results in chromosomal translocations (Stankiewicz and Lupski, 2002) whose stability depends on the orientation of the SDs and the chromosome arms involved. Only when SDs map in the same orientation and on the same chromosome arms (i.e. p-arm of one chromosome versus the p-arm of the other), or when they show opposite orientation on different chromosome arms (i.e. p-arm of one chromosome versus q-arm of the other), does interchromosomal NAHR produce stable, monocentric reciprocal translocation chromosomes. In contrast, SDs in opposite orientation on the same chromosome arms or those in the same orientation on opposite chromosome arms are predicted to result in either unstable dicentric or acentric chromosomes (see Figure 2) (Ou et al., 2011). It has been demonstrated that two pairs of the many olfactory receptor (OR) gene clusters located close to each other, on chromosomes 4p16 and 8p23, are involved in the origin of the t(4;8)(p16;p23) translocation by mediating interchromosomal NAHR (Giglio et al., 2002). Likewise, from 10% to 20% of the translocations between chromosomes 9 and 22 occurred between the 76 kb interchromosomal SDs that are located centromere-proximal to the ABL gene on chromosome 9 and centromere-distal to the BCR gene on chromosome 22, t(9;22)(q34;q11), creating the BCR/ABL fusion gene that is the underlying aetiology of chronic myeloid leukaemia (CML) (Albano et al., 2010). Recently, further molecular evidence has been provided to support NAHR between interchromosomal SDs as a potential major mechanism for recurrent reciprocal translocations (Ou et al., 2011).

[Figure 2: panels (a) and (b), stable reciprocal translocations; panels (c) and (d), unstable reciprocal translocations]

Figure 2 Outcomes of interchromosomal NAHR mediated by SDs. Non-homologous chromosomes are coloured differently, with the centromeres shown as black boxes. Blue and yellow boxes indicate highly identical segmental duplications and arrows indicate their orientation. Stable reciprocal translocations originate from NAHR between interchromosomal SDs located on the same chromosomal arms (i.e. q-arm to q-arm) in direct orientation (a) or on different chromosomal arms (i.e. p-arm to q-arm) in inverted orientation (b). Conversely, SDs located on the same chromosomal arms in inverted orientation (c) or on different chromosomal arms in direct orientation (d) would lead to unstable dicentric and acentric chromosomes, resulting in chromosome breakage and loss, respectively.

The understanding of the NAHR mechanism combined with the availability of the human genome sequence (International Human Genome Sequencing Consortium, 2004) paved the way to massive in silico analyses to predict hotspots for genomic
instability that may be prone to recurrent translocations, identifying 1902 sequences that correspond to interchromosomal SDs of >5 kb in length and >94% DNA sequence identity. Interestingly, some of the potential interchromosomal NAHR pairs represent olfactory receptor gene repeats. Some of the predicted recurrent translocations, however, may be underrepresented, as derivative chromosomes with longer segments of imbalance are more likely to be incompatible with life. High-resolution genome analyses of additional balanced and unbalanced translocations will be required to further confirm the utility of this ‘recurrent translocation map’ (Ou et al., 2011).

Segmental Duplications and Evolution

To date, the sequences of 12 primate genomes have been published, including human (International Human Genome Sequencing Consortium, 2001), chimpanzee (Chimpanzee Sequencing and Analysis Consortium, 2005), gorilla (Scally et al., 2012), orangutan (Locke et al., 2011), gibbon (Carbone et al., 2014), marmoset (Consortium, 2014) and macaque (Rhesus Macaque Genome Sequencing and Analysis Consortium, 2007). Moreover, 36 other mammalian genomes have been sequenced at various levels of coverage; most of them have been resolved by whole genome shotgun (WGS) approaches using different technologies, such as classical capillary methods and next-generation sequencing (NGS).

Despite extensive progress in genome sequencing, SDs are complex parts of genomes to assemble, mostly because of their repetitive nature, similarity and mosaic organization, and thus represent a challenging task in finishing genome assemblies. For this reason, most of the research has focused on finding new methods to discover and genotype SDs, including experimental approaches using hybridization-based microarrays, single-molecule analyses and sequencing-based computational approaches. Notably, different experimental methods and computational analyses of NGS data sets applied to the same genome show low levels of overlap, showing that no single approach detects all the SDs at once (see Table 2) (Alkan et al., 2011).

Table 2 Comparison of SD discovery methods.

Method | Types, with rearrangement marks (columns: Deletions, Duplications, Balanced) | Time (days) | Throughput | Cost | Advantages | Limitations
Microarray | CGH array (• •); SNP array (• •) | 2–3 | High | Low | Complex phenotypes analysis, multiple breakpoints definition | Lack of standardisation, no information on the localisation
FISH | Interphase-nuclei FISH (• • •); Metaphases FISH (a) (• • •) | 2–3 | Low | Medium | Copy number prediction/validation, chromosomal localization of the rearrangement (a) | Low breakpoint resolution
PCR-based | MLPA (• •); MAPH (• •) | <1 | Medium | Low | Low amount of starting material, high feasibility | Small number of analysable loci, sequence availability
Sequencing-based | Read-pair technologies (• • •); Read-depth methods (• •); Split-read approaches (• •); Sequence assembly (• • •) | 2–7 (b) | High | High | Copy number prediction, high breakpoint resolution, whole-genome analysis | High storage space and computing capacity

(a) Type-specific advantages.
(b) Depending on the platform used (MiSeq, HiSeq or newer).

Recent comparative works, using multiple genomic approaches, have shown that human and great ape lineages are particularly enriched for interspersed duplications, with a suggested burst occurring in the common ancestor of humans and African great apes, in contrast to the hominid slowdown of single base-pair mutations (Marques-Bonet et al., 2009). Exceptionally, in gorilla, events of duplicative transposition created a complex pattern of SDs unique to this lineage and syntenic neither to human nor to chimpanzee (Ventura et al., 2011). When compared to other sequenced primates such as marmoset and gibbon and to mammalian genomes, human and great ape SDs tend to be more complex, more interspersed and bigger. In particular, in mammals, SDs are mostly organized in local tandem duplication clusters as opposed to duplicative transpositions to new locations, and the distribution is not homogeneous, showing a preference for pericentromeric and subtelomeric regions (Clop et al., 2012). Despite the simpler structure of SDs in mammals, breed-specific differences have been identified, for example in cow (Nelore breed), where excellent gene candidates (CATHL4, ULBPT7 and KRTAP9-2) embedded in SDs have been reported for pathogen and parasite resistance (Bickhart et al., 2012).

Massive sequencing at high coverage of 97 great ape individuals has recently been used to create a comprehensive assessment of fixed deletions and duplications between humans and great apes. In particular, demographic effects have been hypothesized as the main contributors to larger gene-rich deletions in the chimpanzee lineage, one of them being responsible for the first reported case of a Smith-Magenis-like syndrome phenotype in chimpanzee (see Table 1) (Sudmant et al., 2013).

Many efforts have been focused on understanding the structure and organization of human SDs and their formation. To date, SDs in humans have been found to be organized patchworks of several regions arranged around “core” elements that usually show greater EST and exon density when compared to flanking SDs (see Figure 3). In several primates, the cores have been copied to other locations in the genome, thus creating a totally different set of SDs all centred on the same core duplication (Marques-Bonet and Eichler, 2009). In particular, the genomic distribution of great ape SDs is highly non-random, with the presence of ancestral duplications being a strong predictor of “new”, lineage-specific events. For example, 45% of human–chimpanzee shared duplications map within 5 kb of SDs shared among human–chimpanzee–orangutan, whereas 31% of human–chimpanzee–orangutan duplications map adjacent to human–chimpanzee–orangutan–macaque duplications. These observations emphasize that unique sequences flanking more ancient duplications have a much higher probability of duplicating and that the duplication process itself is not random. This phenomenon is named duplication shadowing (Cheng et al., 2005).

Neocentromeres and fusion genes

As previously reported for human genomic disorders, in species evolution SDs represent a source of structural novelty, triggering CRs that play a crucial role in neocentromere formation and create new genes by gene shuffling.

Comparative studies on chromosome 15 (low copy repeats, LCR15) have highlighted specific chromosomal duplications as preferential sites of chromosomal breakage and rearrangement and of recurrent evolutionary events of expansion and local duplication. Notably, the same duplications turned out to trigger not only the CR through NAHR but also SD accumulation at specific loci in response to a “feedback mechanism” to the chromosomal breakage. After the evolutionary breakage, additional DNA material was transposed from other loci to recover the damage caused by the breakage, thus creating local duplications (Giannuzzi et al., 2013a). Breakpoints of several evolutionary inversions and translocations have recently been associated with SDs. Often, the presence of SDs at these rearrangement breakpoints prevents them from being resolved at the base-pair level (Ventura et al., 2011; Dennis et al., 2012).

CRs mediated by SDs, such as interstitial deletions and inversions, can result in the formation of a chromosomal fragment
lacking the centromere. If a new ectopic centromere (neocentromere) devoid of alpha satellite is formed, this fragment can be inherited as a supernumerary marker chromosome during cell division. Several cases of human neocentromeres (HNs) have been reported (Marshall et al., 2008).

The enrichment in SDs described in pericentromeric regions and ancestral pericentromeric regions can either represent a temporal evolutionary consequence of centromere positioning, or SDs might precede neocentromere formation and play an active role in the repositioning and function of the new centromere. Four of six evolutionary new human centromeres (ENCs), on chromosomes 6, 11, 15 and 21, are among the highest in SD content (She et al., 2004), underlining how SD enrichment in pericentromeric regions is not only a temporal consequence of centromere positioning.

HNs make it possible to study the connection between SDs and neocentromerization without the action of time that, operating over millions of years from centromere positioning to the present, has changed the organization of pericentromeric regions in ENCs. Since 1993, 15 HNs have been described in human region 15q24–26, which is therefore considered a hotspot of neocentromerization. Chromosomes 14 and 15 derived from an ancestral chromosome after a fission event. After the fission, on chromosome 15 the ancestral centromere was inactivated and a new centromere emerged.

Comparative studies have mapped the ancestral centromeric locus to 15q24–26, a chromosomal region enriched in SDs. The evidence that this region still preserves its ability to act as a centromere, despite the inactivation that occurred about 25 mya and the subsequent rapid loss of alphoid sequences, suggests that SDs play a crucial role in triggering centromeric positioning (Ventura et al., 2003).

Figure 3 Core duplicon model for segmental duplication formation. Duplicative transposition (Duplication event 1) creates a copy of the ancestral locus (core duplicon, CD, in green) at a new locus on the same chromosome, thus generating an intrachromosomal duplication. A following event of duplicative transposition (Duplication event 2) involving the CD and flanking single sequences (empty orange squares) moves these regions to a new locus on a non-homologous chromosome, thus creating interchromosomal duplications. This step results in an increase of the segmental duplication size, now composed of the CD (filled green square) plus flanking sequences (filled orange squares). An additional round of duplication (Duplication event 3) involving further flanking single sequences may create bigger and more complex intra- or interchromosomal segmental duplications. The new paralogous sequences share high similarity and can now promote non-allelic homologous recombination. These events take place at multiple points during the speciation process (before and after), generating both lineage-specific and shared duplication blocks between closely related species.

Besides their role in changing the structural organization of a genome by mediating CRs, SDs play an essential role in remodelling transcriptionally active parts of the genome through either full-size or partial duplication of genes.

While full-size duplication of genes is the basis for gene family creation, partial duplication or deletion creates the potential for new chimeric genes (Francis et al., 2012).

Rounds of duplication were responsible for the creation of the BTNL gene family, whose members, located on human chromosomes 1, 5 and 6, are involved in immune response. On chromosome 5, the close proximity of the BTNL8 and BTNL3 genes is the result of a tandem duplication. An NAHR event between these highly identical duplications is responsible for a 56 kb CNV deletion from BTNL8 and BTNL3 that creates the new chimeric BTNL8*3 fusion gene. It has been demonstrated that the BTNL8*3del allele is responsible for the different regulation of BTNL9 and other genes involved in immune response and cancer. The different stratification of this CNV in the major ethnic continental groups corroborates that CNVs and SDs play a fundamental role in adaptive evolution, increasing variability in expression levels (Aigner et al., 2013).

The link between gene duplication and evolution has recently been demonstrated by the human SRGAP2 gene. This gene is extremely conserved, with a single copy in mammals, and is specifically duplicated in humans. Gene expression studies showed that the paralogs (SRGAP2B, C and D) show similar broad patterns of expression, including expression in the developing human foetal brain concurrently with SRGAP2A (the ancestral gene), and often display higher expression in multiple regions of the human cortex and cerebellum when compared to other tissues including lung, kidney and testis. In particular, it has been shown that SRGAP2C encodes a functional protein antagonist of SRGAP2A and is among the most fixed human-specific duplicate genes. Very interestingly, comparative-sequencing studies showed that the duplication occurred 2–3 mya at the transition from Australopithecus to Homo, at the same time as the neocortex expansion
and the use of stone tools. Overall, these data strongly support a role of these genes in human brain evolution (Dennis et al., 2012).

SDs may influence gene expression by moving regulatory elements, such as promoters, close to genes. An example is the LRRC37 gene family, whose expression evolved from a testis-specific to a more complex pattern because of the increase in gene copy number and the juxtaposition of promoters (Giannuzzi et al., 2013b). The acquisition of new promoters via SD formation in the common ancestor of humans and great apes was also responsible for the ‘resurrection’ of the IRGM gene, otherwise not expressed because an Alu retrotransposition disrupted the open reading frame at an early stage of primate evolution (Bekpen et al., 2009).

Conclusions

Since their discovery, there has been growing interest in the structure, distribution and influence of SDs on human and mammalian genome plasticity, but many questions are still open. For example, what is their role in mediating structural rearrangements responsible for genomic disorders? What are the structures and functions of the genes created by SD shuffling?

Although massive sequencing has helped to define the location and distribution of SDs in the analyzed genomes, SDs are still difficult to assemble and thus remain ambiguous regions of human and mammalian genomes that need systematic analyses to be accurately decoded in terms of genotype, copy number content and structure.

Additional comparative high-quality sequences of SDs among primates and mammals will provide useful insights into their diffusion and diversification in different lineages and into the way selection shapes these regions of the genome, thus explaining the dual effect (genomic disorders vs. evolution) that SDs have on the human genome.

It is now clear that SDs represent an impressive source of genomic variation, essential from an evolutionary point of view. They balance negative selection of disease-causing microdeletions and microduplications against positive selection of newly minted gene families embedded in core duplications and distributed to new locations. Most of these genes, both in humans and in other mammals, are involved in immunity and are critically important for individual/species survival. Despite their importance, very little is known about the role of SDs and CNVs in the mechanisms for adaptation and diversification of responses for both host and pathogen. Gaining more insight into SDs is therefore necessary to fully understand the genetics and biology of infectious disease pathogenesis.

Acknowledgments

We thank Dr. Francesca Antonacci for valuable comments and help in the preparation of this manuscript. The authors declare no conflicts of interest. Our work is supported by Futuro in Ricerca 2010 (RBFR103CE3).

References

Aigner J, Villatoro S, Rabionet R, et al. (2013) A common 56-kilobase deletion in a primate-specific segmental duplication creates a novel butyrophilin-like protein. BMC Genetics 14: 61.
Albano F, Anelli L, Zagaria A, et al. (2010) Genomic segmental duplications on the basis of the t(9;22) rearrangement in chronic myeloid leukemia. Oncogene 29: 2509–2516.
Alkan C, Coe BP and Eichler EE (2011) Genome structural variation discovery and genotyping. Nature Reviews. Genetics 12: 363–376.
Beckmann JS, Estivill X and Antonarakis SE (2007) Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nature Reviews. Genetics 8: 639–646.
Bekpen C, Marques-Bonet T, Alkan C, et al. (2009) Death and resurrection of the human IRGM gene. PLoS Genetics 5: e1000403.
Bickhart DM, Hou Y, Schroeder SG, et al. (2012) Copy number variation of individual cattle genomes using next-generation sequencing. Genome Research 22: 778–790.
Brunetti-Pierri N, Berg JS, Scaglia F, et al. (2008) Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nature Genetics 40: 1466–1471.
Carbone L, Alan Harris R, Gnerre S, et al. (2014) Gibbon genome and the fast karyotype evolution of small apes. Nature 513 (7517): 195–201.
Cheng Z, Ventura M, She X, et al. (2005) A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437: 88–93.
Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437: 69–87.
Clop A, Vidal O and Amills M (2012) Copy number variation in the genomes of domestic animals. Animal Genetics 43: 503–517.
Consortium (2014) The common marmoset genome provides insight into primate biology and evolution. Nature Genetics 46: 850–857.
Cusco I, Corominas R, Bayes M, et al. (2008) Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion. Genome Research 18: 683–694.
Dennis MY, Nuttle X, Sudmant PH, et al. (2012) Evolution of human-specific neural SRGAP2 genes by incomplete segmental duplication. Cell 149: 912–922.
Diskin SJ, Hou C, Glessner JT, et al. (2009) Copy number variation at 1q21.1 associated with neuroblastoma. Nature 459: 987–991.
Ensenauer RE, Adeyinka A, Flynn HC, et al. (2003) Microduplication 22q11.2, an emerging syndrome: clinical, cytogenetic, and molecular analysis of thirteen patients. American Journal of Human Genetics 73: 1027–1040.
Francis NJ, McNicholas B, Awan A, et al. (2012) A novel hybrid CFH/CFHR3 gene generated by a microhomology-mediated deletion in familial atypical hemolytic uremic syndrome. Blood 119: 591–601.
Giannuzzi G, Pazienza M, Huddleston J, et al. (2013a) Hominoid fission of chromosome 14/15 and the role of segmental duplications. Genome Research 23: 1763–1773.
Giannuzzi G, Siswara P, Malig M, et al. (2013b) Evolutionary dynamism of the primate LRRC37 gene family. Genome Research 23: 46–59.
Giglio S, Calvari V, Gregato G, et al. (2002) Heterozygous submicroscopic inversions involving olfactory receptor-gene clusters mediate the recurrent t(4;8)(p16;p23) translocation. American Journal of Human Genetics 71: 276–285.
Girirajan S, Rosenfeld JA, Cooper GM, et al. (2010) A recurrent 16p12.1 microdeletion supports a two-hit model for severe developmental delay. Nature Genetics 42: 203–209.
Gonzalez E, Kulkarni H, Bolivar H, et al. (2005) The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307: 1434–1440.
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409: 860–921.
Itsara A, Vissers LE, Steinberg KM, et al. (2012) Resolving the breakpoints of the 17q21.31 microdeletion syndrome with next-generation sequencing. American Journal of Human Genetics 90: 599–613.
Kim PM, Lam HY, Urban AE, et al. (2008) Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history. Genome Research 18: 1865–1874.
Locke DP, Hillier LW, Warren WC, et al. (2011) Comparative and demographic analysis of orang-utan genomes. Nature 469: 529–533.
Marotta M, Chen X, Inoshita A, et al. (2012) A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications. Breast Cancer Research 14: R150.
Marques-Bonet T and Eichler EE (2009) The evolution of human segmental duplications and the core duplicon hypothesis. Cold Spring Harbor Symposia on Quantitative Biology 74: 355–362.
Marques-Bonet T, Kidd JM, Ventura M, et al. (2009) A burst of segmental duplications in the genome of the African great ape ancestor. Nature 457: 877–881.
Marshall OJ, Chueh AC, Wong LH and Choo KH (2008) Neocentromeres: new insights into centromere structure, disease development, and karyotype evolution. American Journal of Human Genetics 82: 261–282.
Merla G, Brunetti-Pierri N, Micale L and Fusco C (2010) Copy number variants at Williams-Beuren syndrome 7q11.23 region. Human Genetics 128: 3–26.
Mulle JG, Dodd AF, McGrath JA, et al. (2010) Microdeletions of 3q29 confer high risk for schizophrenia. American Journal of Human Genetics 87: 229–236.
Ohno S, Wolf U and Atkin NB (1968) Evolution from fish to mammals by gene duplication. Hereditas 59: 169–187.
Ou Z, Stankiewicz P, Xia Z, et al. (2011) Observation and prediction of recurrent human translocations mediated by NAHR between nonhomologous chromosomes. Genome Research 21: 33–46.
Rhesus Macaque Genome Sequencing and Analysis Consortium (2007) Evolutionary and biomedical insights from the rhesus macaque genome. Science 316: 222–234.
Scally A, Dutheil JY, Hillier LW, et al. (2012) Insights into hominid evolution from the gorilla genome sequence. Nature 483: 169–175.
Sharp AJ, Hansen S, Selzer RR, et al. (2006) Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nature Genetics 38: 1038–1042.
She X, Horvath JE, Jiang Z, et al. (2004) The structure and evolution of centromeric transition regions within the human genome. Nature 430: 857–864.
Stankiewicz P and Lupski JR (2002) Genome architecture, rearrangements and genomic disorders. Trends in Genetics 18: 74–82.
Stankiewicz P (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.
Sudmant PH, Huddleston J, Catacchio CR, et al. (2013) Evolution and diversity of copy number variation in the great ape lineage. Genome Research 23: 1373–1382.
Tam E, Young EJ, Morris CA, et al. (2008) The common inversion of the Williams-Beuren syndrome region at 7q11.23 does not cause clinical symptoms. American Journal of Medical Genetics. Part A 146A: 1797–1806.
Turner DJ, Miretti M, Rajan D, et al. (2008) Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nature Genetics 40: 90–95.
Ventura M, Catacchio CR, Alkan C, et al. (2011) Gorilla genome structural variation reveals evolutionary parallelisms with chimpanzee. Genome Research 21: 1640–1649.
Ventura M, Mudge JM, Palumbo V, et al. (2003) Neocentromeres in 15q24-26 map to duplicons which flanked an ancestral centromere in 15q25. Genome Research 13: 2059–2068.
Wain LV, Armour JA and Tobin MD (2009) Genomic copy number variation, human health, and disease. Lancet 374: 340–350.
Weise A, Mrasek K, Klein E, et al. (2012) Microdeletion and microduplication syndromes. The Journal of Histochemistry and Cytochemistry 60: 346–358.
Zhou Y and Mishra B (2005) Quantifying the mechanisms for segmental duplications in mammalian genomes by statistical analysis and modeling. Proceedings of the National Academy of Sciences of the United States of America 102: 4051–4056.

Further Reading

Alkan C, Coe BP and Eichler EE (2011) Genome structural variation discovery and genotyping. Nature Reviews. Genetics 12: 363–376.
Beckmann JS, Estivill X and Antonarakis SE (2007) Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nature Reviews. Genetics 8: 639–646.
Stankiewicz P (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.