Mario Ventura, University of Bari Aldo Moro, Bari, Italy Online posting date: 27th January 2015
Genome mutations represent a source of variability on which selective pressure acts: negative changes are purged from populations, whereas positive and neutral changes may be fixed. Segmental duplications (SDs) in the human genome trigger mutations such as structural rearrangements (duplications, deletions, inversions and translocations), thus playing a crucial role in human disease and genome evolution. Several human diseases (genomic disorders) are caused by non-allelic homologous recombination (NAHR) between highly similar SDs, and gene-containing SDs were crucial to the survival and adaptation of the human species during evolution. Moreover, from both a pathological and an evolutionary point of view, SDs represent critically important regions for acentric fragment rescue during the neocentromerization process. In this light, disease and evolution can be considered as ‘two sides of the same coin’, where the coin represents the SD-mediated chromosomal rearrangements.

narrow regions can be expanded, thus increasing the copy number of genes embedded in them. In the latter case, the involved sequences are generally named SDs.

SDs represent a ‘new’ genomic repetitive element class, commonly defined as highly identical duplicated DNA fragments greater than 1 kb and distinguished from any other classic repetitive element, such as LINEs, by the absence of any recurrent element in them. Most of them show an interspersed, rather than tandem, organization. SDs in humans do not cluster but are rather enriched in regions such as pericentromeres and subtelomeres and on a few chromosomes, such as 7, 15, 16, 17 and 22, thus creating ‘hot spots’ of NAHR events (Girirajan et al., 2010).

Since the discovery of SDs, a large body of data has been produced on their involvement in genomic disorders and evolution. In particular, regions of, or enriched in, SDs, because of their high identity and wide distribution in the genomes, are favoured sites of copy number polymorphisms, genomic disorders and evolutionary breakpoints. Rearrangements may create new genes with different spatial and temporal expression, thus creating and differentiating gene pools in cells or organisms. In this light, SDs are excellent substrates for genomic variability and represent exceptional examples of the ‘two sides of the same coin’ effect: disease versus evolution. In this article, we focus on how SDs in human, primate and mammalian genomes have played an important role in genome plasticity (See also: Segmental Duplications and Their Role in the Evolution of the Human Genome).
Recent studies have consistently coupled SD-mediated NAHR causing genomic structural rearrangements to complex phenotypes associated with mental illnesses, such as autism spectrum disorders (ASD), schizophrenia, mental retardation and intellectual disability (Beckmann et al., 2007) (see Table 1).

The genotype/phenotype correlation and the variability in penetrance and expression regarding the genetics of mental illnesses mainly depend on the role of genes contained in the deleted/duplicated interval. However, the parental origin of the CNV, as well as the presence of variations in the wild-type allele, has also been shown to play an important role. Preliminary evidence has also been provided on the contribution of genes or expressed pseudogenes contained in the SDs (besides those located in the single region between the SD blocks) to the clinical phenotype (Merla et al., 2010).

Knowledge of the basic features of the SDs mediating NAHR events that cause genomic disorders allowed the use of an inverse strategy to detect genomic regions involved in the aetiology of mental retardation: a map of potential ‘rearrangement hotspots’ in the human genome has been generated, highlighting regions of genomic instability. On the basis of this map, several genomic abnormalities on 17q21.31, 1q21.1, 15q13.1–13.3 and 15q24.1–q24.3 were detected using a customized array (Sharp et al., 2006).

Inversions and translocations

Inversions generated by meiotic or mitotic intrachromatidic misalignment between inverted homologous SDs can be considered a benign polymorphism because carriers are phenotypically normal (Tam et al., 2008). However, polymorphic chromosomal microinversions have been found as putative susceptibility or resistance variants for some of the previously mentioned MMSs. Indeed, these structural events can predispose to or protect against further genomic rearrangements, enhancing or lowering the disease risk in carriers. Two extended haplotypes, designated H1 and H2, have been identified in the 17q21.31 hotspot region, with H2 being inverted compared to the reference genome. The inverted haplotype is rare in Africans and almost absent in East Asians but is found at a frequency of 20% in Europeans. The 900 kb inversion haplotype is the only one that carries SDs in direct orientation at both breakpoints of the 17q21.31 microdeletion region and therefore has a tendency to undergo NAHR leading to the 17q21.31 microdeletion syndrome (MIM 610433) (Itsara et al., 2012). These results bear striking similarity to another region of the human genome: heterozygosity for a polymorphic ∼2 Mb chromosomal microinversion in 7q11.23 is thought to lead to abnormal meiotic pairing and therefore an increased susceptibility to unequal recombination causing the Williams-Beuren syndrome deletion. Similarly, paracentric microinversions have been found as putative susceptibility variants for other recurrent genomic rearrangements, such as the deletions causing Angelman and Sotos syndromes (Cusco et al., 2008).

NAHR between interchromosomal SDs (i.e. on non-homologous chromosomes) results in chromosomal translocations (Stankiewicz and Lupski, 2002) whose stability depends on the orientation of the SDs and the chromosome arms involved. Interchromosomal NAHR produces stable, monocentric reciprocal translocation chromosomes only when the SDs map in the same orientation on the same chromosome arms (i.e. p-arm of one chromosome versus the p-arm of the other) or in opposite orientation on different chromosome arms (i.e. p-arm of one chromosome versus q-arm of the other). In contrast, SDs in opposite orientation on the same chromosome arms or those in the same orientation on different chromosome arms are predicted to result in unstable dicentric or acentric chromosomes (see Figure 2) (Ou et al., 2011). It has been demonstrated that two pairs of the many olfactory receptor (OR) gene clusters located close to each other, on chromosomes 4p16 and 8p23, are involved in the origin of the t(4;8)(p16;p23) translocation by mediating interchromosomal NAHR (Giglio et al., 2002). Likewise, 10% to 20% of the translocations between chromosomes 9 and 22 occurred between the 76 kb interchromosomal SDs that are located centromere-proximal to the ABL gene on chromosome 9 and centromere-distal to the BCR gene on chromosome 22; the resulting t(9;22)(q34;q11) creates the BCR/ABL fusion gene that is the underlying aetiology of chronic myeloid leukaemia (CML) (Albano et al., 2010). Recently, further molecular evidence has been provided to support NAHR between interchromosomal SDs as a potential major mechanism for recurrent reciprocal translocations (Ou et al., 2011).

The understanding of the NAHR mechanism combined with the availability of the human genome sequence (International Human Genome Sequencing Consortium, 2004) paved the way for massive in silico analyses to predict hotspots for genomic
[Figure 2: schematic of p and q chromosome arms. Panels (a) and (b): stable reciprocal translocations. Panels (c) and (d): unstable reciprocal translocations.]
Figure 2 Outcomes of interchromosomal NAHR mediated by SDs. Non-homologous chromosomes are coloured differently, with the centromeres shown as black boxes. Blue and yellow boxes indicate highly identical segmental duplications and arrows indicate their orientation. Stable reciprocal translocations originate from NAHR between interchromosomal SDs located on the same chromosomal arms (i.e. q-arm to q-arm) in direct orientation (a) or on different chromosomal arms (i.e. p-arm to q-arm) in inverted orientation (b). Conversely, SDs located on the same chromosomal arms in inverted orientation (c) or on different chromosomal arms in direct orientation (d) would lead to unstable dicentric and acentric chromosomes, resulting in chromosome breakage and loss, respectively.
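The orientation rule summarized in Figure 2 can be written as a small decision function. The following is only an illustrative sketch: the function name, its arguments and the returned labels are hypothetical, and it assumes that each SD can be assigned to a single arm (p or q) and that the relative orientation of the two SDs (direct or inverted) is already known.

```python
def translocation_outcome(arm_a: str, arm_b: str, direct_orientation: bool) -> str:
    """Predict the outcome of interchromosomal NAHR between two SDs.

    arm_a, arm_b: chromosome arm ('p' or 'q') carrying each SD on the two
        non-homologous chromosomes.
    direct_orientation: True if the SDs are in the same (direct) orientation,
        False if they are inverted relative to each other.
    Hypothetical helper written for illustration; it encodes only the rule
    stated in the text and in the Figure 2 caption.
    """
    for arm in (arm_a, arm_b):
        if arm not in ("p", "q"):
            raise ValueError(f"arm must be 'p' or 'q', got {arm!r}")
    same_arm = arm_a == arm_b
    # Stable: same arm + direct orientation, or different arms + inverted.
    if same_arm == direct_orientation:
        return "stable (monocentric reciprocal translocation)"
    # Otherwise dicentric and acentric derivative chromosomes are predicted.
    return "unstable (dicentric/acentric chromosomes)"


# The four panels of Figure 2
panel_a = translocation_outcome("q", "q", direct_orientation=True)
panel_b = translocation_outcome("p", "q", direct_orientation=False)
panel_c = translocation_outcome("q", "q", direct_orientation=False)
panel_d = translocation_outcome("p", "q", direct_orientation=True)
```

The single comparison `same_arm == direct_orientation` covers all four panels: the prediction is "stable" exactly when the two conditions agree.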
instability that may be prone to recurrent translocations, identifying 1902 sequences that correspond to interchromosomal SDs of >5 kb in length and >94% DNA sequence identity. Interestingly, some of the potential interchromosomal NAHR pairs represent olfactory receptor gene repeats. Some of the predicted recurrent translocations, however, may be underrepresented, as derivative chromosomes with longer segments of imbalance are more likely to be incompatible with life. High-resolution genome analyses of additional balanced and unbalanced translocations will be required to further confirm the utility of this ‘recurrent translocation map’ (Ou et al., 2011).

Segmental Duplications and Evolution

To date, the sequences of 12 primate genomes have been published, including human (International Human Genome Sequencing Consortium, 2001), chimpanzee (Chimpanzee Sequencing and Analysis Consortium, 2005), gorilla (Scally et al., 2012), orangutan (Locke et al., 2011), gibbon (Carbone et al., 2014), marmoset (Consortium, 2014) and macaque (Rhesus Macaque Genome Sequencing and Analysis Consortium, 2007). Moreover, 36 other mammalian genomes have been sequenced at various levels of coverage; most of them have been resolved by whole genome shotgun (WGS) approaches using different technologies, such as classical capillary methods and next-generation sequencing (NGS).

Despite extensive progress in genome sequencing, SDs are difficult parts of genomes to assemble, mostly because of their repetitive nature, similarity and mosaic organization, and thus represent a challenge in finishing genome assemblies. For this reason, much research has focused on finding new methods to discover and genotype SDs, including experimental approaches using hybridization-based microarrays, single-molecule analyses and sequencing-based computational approaches. Notably, different experimental methods and computational analyses of NGS data sets applied to the same genome show low levels of overlap, indicating that no single approach detects all the SDs at once (see Table 2) (Alkan et al., 2011).

Recent comparative studies, using multiple genomic approaches, have shown that human and great ape lineages are particularly enriched for interspersed duplications, with a suggested burst occurring in the common ancestor of humans and African great apes, in contrast to the hominid slowdown of single base-pair mutations (Marques-Bonet et al., 2009). Exceptionally, in gorilla, events of duplicative transposition created a complex pattern of SDs unique to this lineage and syntenic neither to human nor to chimpanzee (Ventura et al., 2011). When compared to other sequenced primates, such as marmoset and gibbon, and to mammalian genomes, human and great ape SDs tend to be more complex, more interspersed and larger. In particular, in mammals, SDs are mostly organized in local tandem duplication clusters as opposed to duplicative transpositions to new locations, and the distribution is not homogeneous, showing a preference for pericentromeric and subtelomeric regions (Clop et al., 2012). Despite the simpler structure of SDs in mammals, breed-specific differences have been identified, such as in cow (Nelore breed), where excellent gene candidates (CATHL4, ULBPT7 and KRTAP9-2) embedded in SDs have been reported for pathogen and parasite resistance (Bickhart et al., 2012).

Massive sequencing at high coverage of 97 great ape individuals has recently been used to create a comprehensive assessment of fixed deletions and duplications between humans and great apes. In particular, demographic effects have been hypothesized as main contributors to larger gene-rich deletions in the chimpanzee lineage, one of them being responsible for the first reported case of a Smith-Magenis-like syndrome phenotype in chimpanzee (see Table 1) (Sudmant et al., 2013).

Many efforts have focused on understanding the structure and organization of human SDs and their formation. To date, SDs in humans have been found to form organized patchworks of several regions arranged around “core” elements that usually show greater EST and exon density when compared to flanking SDs (see Figure 3). In several primates, the cores have been copied to other locations in the genome, thus creating a totally different set of SDs all centred on the same core duplication (Marques-Bonet and Eichler, 2009). In particular, the genomic distribution of great ape SDs is highly non-random, with the presence of ancestral duplications being a strong predictor of “new”, lineage-specific events. For example, 45% of human–chimpanzee shared duplications map within 5 kb of SDs shared among human–chimpanzee–orangutan, whereas 31% of human–chimpanzee–orangutan duplications map adjacent to human–chimpanzee–orangutan–macaque duplications. These observations emphasize that unique sequences flanking more ancient duplications have a much higher probability of duplicating and that the duplication process itself is not random. This phenomenon is named duplication shadowing (Cheng et al., 2005).

Neocentromeres and fusion genes

As previously reported for human genomic disorders, in species evolution SDs represent a source of structural novelty, triggering CRs that play a crucial role in neocentromere formation and create new genes by gene shuffling.

Comparative studies on chromosome 15 (low copy repeats, LCR15) have highlighted specific chromosomal duplications as preferential sites of chromosomal breakage and rearrangement and of recurrent evolutionary events of expansion and local duplication. Notably, the same duplications appear to trigger not only the CR through NAHR but also SD accumulation at specific loci, as a “feedback mechanism” in response to the chromosomal breakage. After the evolutionary breakage, additional DNA material transposed from other loci to repair the damage caused by the breakage, thus creating local duplications (Giannuzzi et al., 2013a). Breakpoints of several evolutionary inversions and translocations have recently been associated with SDs. Often, the presence of SDs at these rearrangement breakpoints prevents them from being resolved at the base-pair level (Ventura et al., 2011; Dennis et al., 2012).

CRs mediated by SDs, such as interstitial deletions and inversions, can result in the formation of a chromosomal fragment
[Table 2, partial row; column headers not recovered] Sequencing-based approaches (read-pair technologies, read-depth methods, split-read approaches): 2–7b; High; High; copy number prediction, high breakpoint resolution, whole-genome analysis; high storage space and computing capacity.
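Table 2 lists read-depth methods among the sequencing-based approaches that allow copy number prediction. The general idea can be sketched as follows. This is a minimal illustration and not the procedure used in any of the cited studies: the function name and the depth values are hypothetical, and it assumes that mean read depth per window has already been computed and that a set of control windows known to be diploid single-copy is available.

```python
from statistics import median


def estimate_copy_number(window_depths, control_depths, ploidy=2):
    """Estimate integer copy number per window from read depth.

    window_depths: mean read depth for each window of interest.
    control_depths: mean read depth for windows assumed to be present
        at the normal diploid copy number (an assumption of this sketch).
    ploidy: copy number expected for the control windows.
    """
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control depth must be positive")
    # Depth is assumed to scale linearly with copy number.
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Hypothetical depths, for illustration only
control = [29.0, 30.0, 31.0, 30.0, 30.5]
windows = [30.2, 45.1, 60.3, 14.8]
copy_numbers = estimate_copy_number(windows, control)  # [2, 3, 4, 1]
```

Real read-depth pipelines also correct for factors such as GC content and mappability, which this sketch deliberately omits.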
and the use of stone tools. Overall, these data strongly support a role of these genes in human brain evolution (Dennis et al., 2012).

SDs may influence gene expression by moving regulatory elements, such as promoters, close to genes. An example is the LRRC37 gene family, whose expression evolved from a testis-specific to a more complex pattern because of the increase in gene copy number and the juxtaposition of promoters (Giannuzzi et al., 2013b). The acquisition of new promoters via SD formation in the common ancestor of human and great apes was also responsible for the ‘resurrection’ of the IRGM gene, otherwise not expressed because of an Alu retrotransposition disrupting the open reading frame at an early stage of primate evolution (Bekpen et al., 2009).

Conclusions

Since their discovery, there has been growing interest in the structure, distribution and influence of SDs on human and mammalian genome plasticity, but many questions are still open. For example, what is their role in mediating structural rearrangements responsible for genomic disorders? What are the structures and functions of the genes created by SD shuffling?

Although massive sequencing has helped in defining the location and distribution of SDs in the analyzed genomes, SDs are still difficult to assemble and thus remain ambiguous regions of human and mammalian genomes that need systematic analyses to be accurately decoded in terms of genotype, copy number content and structure.

Additional comparative high-quality sequences of SDs among primates and mammals will provide useful insights into their diffusion and diversification in different lineages and into the way selection shapes these regions of the genome, thus explaining the dual effect (genomic disorders vs. evolution) that SDs have on the human genome.

It is now clear that SDs represent an impressive source of genomic variation, essential from an evolutionary point of view. They balance negative selection of disease-causing microdeletions and microduplications against positive selection of newly minted gene families embedded in core duplications and distributed to new locations. Most of these genes, both in humans and in mammals, are involved in immunity and are critically important for individual/species survival. Despite their importance, very little is known about the role of SDs and CNVs in the mechanisms of adaptation and diversification of responses for both host and pathogen. Gaining more insight into SDs is therefore necessary to fully understand the genetics and biology of infectious disease pathogenesis.

Acknowledgments

We thank Dr. Francesca Antonacci for valuable comments and help in the preparation of this manuscript. The authors declare no conflicts of interest. Our work is supported by Futuro in Ricerca 2010 (RBFR103CE3).

References

Aigner J, Villatoro S, Rabionet R, et al. (2013) A common 56-kilobase deletion in a primate-specific segmental duplication creates a novel butyrophilin-like protein. BMC Genetics 14: 61.
Albano F, Anelli L, Zagaria A, et al. (2010) Genomic segmental duplications on the basis of the t(9;22) rearrangement in chronic myeloid leukemia. Oncogene 29: 2509–2516.
Alkan C, Coe BP and Eichler EE (2011) Genome structural variation discovery and genotyping. Nature Reviews. Genetics 12: 363–376.
Beckmann JS, Estivill X and Antonarakis SE (2007) Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nature Reviews. Genetics 8: 639–646.
Bekpen C, Marques-Bonet T, Alkan C, et al. (2009) Death and resurrection of the human IRGM gene. PLoS Genetics 5: e1000403.
Bickhart DM, Hou Y, Schroeder SG, et al. (2012) Copy number variation of individual cattle genomes using next-generation sequencing. Genome Research 22: 778–790.
Brunetti-Pierri N, Berg JS, Scaglia F, et al. (2008) Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nature Genetics 40: 1466–1471.
Carbone L, Alan Harris R, Gnerre S, et al. (2014) Gibbon genome and the fast karyotype evolution of small apes. Nature 513 (7517): 195–201.
Cheng Z, Ventura M, She X, et al. (2005) A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437: 88–93.
Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437: 69–87.
Clop A, Vidal O and Amills M (2012) Copy number variation in the genomes of domestic animals. Animal Genetics 43: 503–517.
Consortium (2014) The common marmoset genome provides insight into primate biology and evolution. Nature Genetics 46: 850–857.
Cusco I, Corominas R, Bayes M, et al. (2008) Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion. Genome Research 18: 683–694.
Dennis MY, Nuttle X, Sudmant PH, et al. (2012) Evolution of human-specific neural SRGAP2 genes by incomplete segmental duplication. Cell 149: 912–922.
Diskin SJ, Hou C, Glessner JT, et al. (2009) Copy number variation at 1q21.1 associated with neuroblastoma. Nature 459: 987–991.
Ensenauer RE, Adeyinka A, Flynn HC, et al. (2003) Microduplication 22q11.2, an emerging syndrome: clinical, cytogenetic, and molecular analysis of thirteen patients. American Journal of Human Genetics 73: 1027–1040.
Francis NJ, McNicholas B, Awan A, et al. (2012) A novel hybrid CFH/CFHR3 gene generated by a microhomology-mediated deletion in familial atypical hemolytic uremic syndrome. Blood 119: 591–601.
Giannuzzi G, Pazienza M, Huddleston J, et al. (2013a) Hominoid fission of chromosome 14/15 and the role of segmental duplications. Genome Research 23: 1763–1773.
Giannuzzi G, Siswara P, Malig M, et al. (2013b) Evolutionary dynamism of the primate LRRC37 gene family. Genome Research 23: 46–59.
Giglio S, Calvari V, Gregato G, et al. (2002) Heterozygous submicroscopic inversions involving olfactory receptor-gene clusters mediate the recurrent t(4;8)(p16;p23) translocation. American Journal of Human Genetics 71: 276–285.
Girirajan S, Rosenfeld JA, Cooper GM, et al. (2010) A recurrent 16p12.1 microdeletion supports a two-hit model for severe developmental delay. Nature Genetics 42: 203–209.
Gonzalez E, Kulkarni H, Bolivar H, et al. (2005) The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307: 1434–1440.
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409: 860–921.
Itsara A, Vissers LE, Steinberg KM, et al. (2012) Resolving the breakpoints of the 17q21.31 microdeletion syndrome with next-generation sequencing. American Journal of Human Genetics 90: 599–613.
Kim PM, Lam HY, Urban AE, et al. (2008) Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history. Genome Research 18: 1865–1874.
Locke DP, Hillier LW, Warren WC, et al. (2011) Comparative and demographic analysis of orang-utan genomes. Nature 469: 529–533.
Marotta M, Chen X, Inoshita A, et al. (2012) A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications. Breast Cancer Research 14: R150.
Marques-Bonet T and Eichler EE (2009) The evolution of human segmental duplications and the core duplicon hypothesis. Cold Spring Harbor Symposia on Quantitative Biology 74: 355–362.
Marques-Bonet T, Kidd JM, Ventura M, et al. (2009) A burst of segmental duplications in the genome of the African great ape ancestor. Nature 457: 877–881.
Marshall OJ, Chueh AC, Wong LH and Choo KH (2008) Neocentromeres: new insights into centromere structure, disease development, and karyotype evolution. American Journal of Human Genetics 82: 261–282.
Merla G, Brunetti-Pierri N, Micale L and Fusco C (2010) Copy number variants at Williams-Beuren syndrome 7q11.23 region. Human Genetics 128: 3–26.
Mulle JG, Dodd AF, McGrath JA, et al. (2010) Microdeletions of 3q29 confer high risk for schizophrenia. American Journal of Human Genetics 87: 229–236.
Ohno S, Wolf U and Atkin NB (1968) Evolution from fish to mammals by gene duplication. Hereditas 59: 169–187.
Ou Z, Stankiewicz P, Xia Z, et al. (2011) Observation and prediction of recurrent human translocations mediated by NAHR between nonhomologous chromosomes. Genome Research 21: 33–46.
Rhesus Macaque Genome Sequencing and Analysis Consortium (2007) Evolutionary and biomedical insights from the rhesus macaque genome. Science 316: 222–234.
Scally A, Dutheil JY, Hillier LW, et al. (2012) Insights into hominid evolution from the gorilla genome sequence. Nature 483: 169–175.
Sharp AJ, Hansen S, Selzer RR, et al. (2006) Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nature Genetics 38: 1038–1042.
She X, Horvath JE, Jiang Z, et al. (2004) The structure and evolution of centromeric transition regions within the human genome. Nature 430: 857–864.
Stankiewicz P and Lupski JR (2002) Genome architecture, rearrangements and genomic disorders. Trends in Genetics 18: 74–82.
Stankiewicz P (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.
Sudmant PH, Huddleston J, Catacchio CR, et al. (2013) Evolution and diversity of copy number variation in the great ape lineage. Genome Research 23: 1373–1382.
Tam E, Young EJ, Morris CA, et al. (2008) The common inversion of the Williams-Beuren syndrome region at 7q11.23 does not cause clinical symptoms. American Journal of Medical Genetics. Part A 146A: 1797–1806.
Turner DJ, Miretti M, Rajan D, et al. (2008) Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nature Genetics 40: 90–95.
Ventura M, Catacchio CR, Alkan C, et al. (2011) Gorilla genome structural variation reveals evolutionary parallelisms with chimpanzee. Genome Research 21: 1640–1649.
Ventura M, Mudge JM, Palumbo V, et al. (2003) Neocentromeres in 15q24-26 map to duplicons which flanked an ancestral centromere in 15q25. Genome Research 13: 2059–2068.
Wain LV, Armour JA and Tobin MD (2009) Genomic copy number variation, human health, and disease. Lancet 374: 340–350.
Weise A, Mrasek K, Klein E, et al. (2012) Microdeletion and microduplication syndromes. The Journal of Histochemistry and Cytochemistry 60: 346–358.
Zhou Y and Mishra B (2005) Quantifying the mechanisms for segmental duplications in mammalian genomes by statistical analysis and modeling. Proceedings of the National Academy of Sciences of the United States of America 102: 4051–4056.

Further Reading

Alkan C, Coe BP and Eichler EE (2011) Genome structural variation discovery and genotyping. Nature Reviews. Genetics 12: 363–376.
Beckmann JS, Estivill X and Antonarakis SE (2007) Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nature Reviews. Genetics 8: 639–646.
Stankiewicz P (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.