Transposable elements and genome evolution

What is a transposable element?


A transposable element (TE, transposon, or jumping gene) is a DNA
sequence that can change its position within a genome, sometimes
creating or reversing mutations and altering the cell's genetic identity and
genome size. Transposable elements make up a large fraction of the
genome and are responsible for much of the mass of DNA in a eukaryotic
cell. Although TEs are selfish genetic elements, many are important in
genome function and evolution.

What are the classes of TEs?


Transposable elements represent one of several types of mobile genetic
elements. TEs are assigned to one of two classes according to their
mechanism of transposition, which can be described as either copy and
paste (Class I TEs) or cut and paste (Class II TEs).

What are the groups of Retrotransposons?


Retrotransposons are commonly grouped into three main orders:
• Retrotransposons with long terminal repeats (LTRs), which encode
reverse transcriptase, similar to retroviruses
• Retroposons, or long interspersed nuclear elements (LINEs, LINE-1s,
or L1s), which encode reverse transcriptase but lack LTRs, and are
transcribed by RNA polymerase II
• Short interspersed nuclear elements (SINEs), which do not encode
reverse transcriptase and are transcribed by RNA polymerase III
(Retroviruses can also be considered TEs.)
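
As a rough illustration, the grouping above can be written as a small decision rule. This is only a sketch: the function name `classify_retrotransposon` and its two boolean inputs are hypothetical, and it uses just the two features the list mentions (LTRs and reverse transcriptase), ignoring which RNA polymerase transcribes the element.

```python
def classify_retrotransposon(has_ltrs: bool, encodes_reverse_transcriptase: bool) -> str:
    """Assign a Class I element to one of the three orders listed above.

    Hypothetical helper for illustration; real classification relies on
    sequence structure and homology, not on two flags.
    """
    if has_ltrs and encodes_reverse_transcriptase:
        return "LTR retrotransposon"
    if encodes_reverse_transcriptase:
        # Encodes reverse transcriptase but lacks LTRs.
        return "LINE"
    if not has_ltrs:
        # Neither LTRs nor reverse transcriptase.
        return "SINE"
    # LTRs without reverse transcriptase is not covered by the list above.
    return "unclassified"


if __name__ == "__main__":
    print(classify_retrotransposon(has_ltrs=False, encodes_reverse_transcriptase=True))
```

The fourth combination (LTRs without reverse transcriptase) is deliberately left as "unclassified", since the three orders above do not describe it.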

What are DNA transposons?


The cut-and-paste transposition mechanism of class II TEs does not
involve an RNA intermediate. The transpositions are catalyzed by several
transposase enzymes. Some transposases non-specifically bind to any
target site in DNA, whereas others bind to specific target sequences. The
transposase makes a staggered cut at the target site producing sticky
ends, cuts out the DNA transposon and ligates it into the target site.
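
The difference between cut and paste and copy and paste can be sketched on a genome modelled as a plain string. This is a toy model under stated assumptions, not a description of the enzymology. The function names and parameters are hypothetical. `stagger` stands for the length of the staggered cut. The duplication of the target site on both sides of the insert reflects the gap fill-in that is known to follow a staggered cut, a step the text above does not cover. The RNA intermediate of Class I elements is not modelled.

```python
def _check_element(genome: str, te_start: int, te_end: int) -> None:
    if not 0 <= te_start < te_end <= len(genome):
        raise ValueError("element coordinates lie outside the genome")


def cut_and_paste(genome: str, te_start: int, te_end: int,
                  target: int, stagger: int = 0) -> str:
    """Class II style move: excise genome[te_start:te_end] and reinsert it.

    `target` indexes the sequence that remains after excision. With
    stagger > 0, the `stagger` bases starting at `target` end up on both
    sides of the inserted element.
    """
    _check_element(genome, te_start, te_end)
    element = genome[te_start:te_end]
    remainder = genome[:te_start] + genome[te_end:]
    if not 0 <= target <= len(remainder) - stagger:
        raise ValueError("target site lies outside the sequence")
    site = remainder[target:target + stagger]
    return remainder[:target] + site + element + remainder[target:]


def copy_and_paste(genome: str, te_start: int, te_end: int,
                   target: int, stagger: int = 0) -> str:
    """Class I style move: the original copy stays, a new copy is inserted.

    `target` indexes the unmodified genome.
    """
    _check_element(genome, te_start, te_end)
    element = genome[te_start:te_end]
    if not 0 <= target <= len(genome) - stagger:
        raise ValueError("target site lies outside the sequence")
    site = genome[target:target + stagger]
    return genome[:target] + site + element + genome[target:]


if __name__ == "__main__":
    # Lowercase marks the element so it is easy to spot in the output.
    genome = "ACGT" + "ttgacaa" + "GGCCTTAAGG"
    print(cut_and_paste(genome, 4, 11, target=8, stagger=2))
    print(copy_and_paste(genome, 4, 11, target=15, stagger=2))
```

In this model a cut-and-paste move leaves the element count unchanged, while each copy-and-paste move adds one copy, which is one way to see why Class I elements can inflate genome size.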

Genomic location
Recent studies of transposable elements have mapped the distribution of
TEs relative to transcription start sites (TSSs) and enhancers. One
study found that about 25% of promoter regions harbor TEs. Older TEs
are rarely found at TSSs, and TE frequency increases with distance from
the TSS. One possible explanation is that TEs close to the TSS might
interfere with transcriptional pausing or first-intron splicing.

Negative effects of TEs


Transposons have coexisted with eukaryotes for millions of years and
through this coexistence have become integrated in many organisms'
genomes. Colloquially known as 'jumping genes', transposons can move
within and between genomes, which allows for this integration. While
transposons have many positive effects on their host eukaryotic genomes,
they can also be mutagenic, leading to disease and malignant genetic
alterations. For example, a transposon or a retrotransposon that inserts
itself into a functional gene can disable that gene.

How do TEs facilitate evolution?


TEs are found in almost all life forms, and the scientific community is still
exploring their evolution and their effect on genome evolution. While
some TEs confer benefits on their hosts, most are regarded as selfish DNA
parasites. In this way, they are similar to viruses. Various viruses and TEs
also share features in their genome structures and biochemical abilities,
leading to speculation that they share a common ancestor. Because
excessive TE activity can damage exons, many organisms have acquired
mechanisms to inhibit their activity. Bacteria may undergo high rates of
gene deletion as part of a mechanism to remove TEs and viruses from
their genomes, while eukaryotic organisms typically use RNA interference
to inhibit TE activity. Nevertheless, some TEs generate large families
often associated with speciation events. Evolution often deactivates DNA
transposons, leaving them as inactive remnants in noncoding DNA such as
introns. In vertebrate animal cells, nearly all 100,000+ DNA transposons
per genome have genes that encode inactive transposase polypeptides.
Many human genes are also derived from transposons. Transposons do not
always excise precisely, sometimes taking adjacent base pairs with them;
when the captured sequence includes an exon, this can lead to exon
shuffling. Shuffling two unrelated exons can create a novel gene product
or, more likely, a nonfunctional sequence.

How can we de novo identify repeats?


De novo repeat identification is an initial scan of sequence data that seeks
to find the repetitive regions of the genome, and to classify these repeats.
Many computer programs exist to perform de novo repeat identification,
all operating under the same general principles. As short tandem repeats
are generally 1–6 base pairs in length and are often consecutive, their
identification is relatively simple. Dispersed repetitive elements, on the
other hand, are more challenging to identify, due to the fact that they are
longer and have often acquired mutations. However, it is important to
identify these repeats as they are often found to be transposable
elements (TEs). De novo identification of transposons involves three
steps: 1) find all repeats within the genome, 2) build a consensus of each
family of sequences, and 3) classify these repeats.
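
The ideas above can be sketched in a few lines of Python. This is a deliberately naive illustration, not how any particular repeat-finding program works. All function names and thresholds are hypothetical. Exact k-mer counting stands in for step 1, a column-wise majority vote over copies that are assumed to be already aligned and of equal length stands in for step 2, and step 3 is reduced to telling tandem from dispersed copies by their spacing.

```python
import re
from collections import Counter, defaultdict


def find_tandem_repeats(seq, min_units=3):
    """Return (start, unit, copies) for 1-6 bp units repeated back to back."""
    # The lazy {1,6}? tries the shortest unit first; \1 demands exact repeats.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_units - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
            for m in pattern.finditer(seq)]


def find_repeated_kmers(seq, k, min_copies=2):
    """Step 1 (toy): map each k-mer seen >= min_copies times to its starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_copies}


def consensus(copies):
    """Step 2 (toy): majority base per column of equal-length, aligned copies."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("copies must have equal length (align them first)")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


def classify(positions, k):
    """Step 3 (toy): 'tandem' if every copy directly follows the previous one."""
    if len(positions) < 2:
        raise ValueError("need at least two copies to classify a repeat")
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return "tandem" if all(gap == k for gap in gaps) else "dispersed"


if __name__ == "__main__":
    print(find_tandem_repeats("GGCACACACATT"))
    repeats = find_repeated_kmers("ACGTACGTTTACGT", k=4)
    for kmer, pos in repeats.items():
        print(kmer, pos, classify(pos, 4))
    print(consensus(["ACGTT", "ACGAT", "ACGTT"]))
```

Exact matching is the main limitation of this sketch: dispersed repeats have usually acquired mutations, so practical approaches need approximate matching and alignment before a consensus can be built.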

Genome architecture
Organisms show extreme variability in genome sizes. Gene content also
varies considerably between organisms. Several species have genomes
with an extremely high proportion of repetitive DNA, reaching 74% in
some cases. Genomes are typically rich in noncoding DNA and display
an irregular architecture, with an uneven distribution of genes and
repetitive elements across and between chromosomes. The main genome
architecture features identified so far are described below:

Gene clusters
A gene cluster is a group of two or more genes found within an
organism's DNA that encode similar polypeptides, or proteins, which
collectively share a generalized function and are often located within a
few thousand base pairs of each other. The size of gene clusters can vary
significantly, from a few genes to several hundred genes. In many
species, genes encoding metabolic enzymes often occur in coexpressed
clusters. Classical examples include gene clusters for secondary
metabolite pathways that mediate the synthesis of host-selective toxins.

Gene-sparse (aka gene desert) regions


Several organisms have compartmentalized genomes with a discontinuous
distribution of gene density owing to the occurrence of repeat-rich,
gene-sparse regions (also known as transposon islands). Gene-sparse
regions constitute an estimated 25% of the entire genome, leading to the
recent interest in their true functions. Originally believed to contain
inessential “junk” DNA because they do not encode proteins, gene
deserts have since been linked to several vital regulatory functions,
including distal enhancing and conservatory inheritance. Accordingly, an
increasing number of risk factors for several major diseases, including
a handful of cancers, have been attributed to irregularities found in
gene deserts.

Isochore-like regions
Some organisms have an unusual bipartite genome structure with
alternating blocks that differ sharply in GC content. In one such
genome, a total of 216 AT-rich blocks ranging in length from 13 kb to
325 kb were identified; these areas resemble the so-called isochore
regions that have been described in mammals and other vertebrates. Some
of these isochore-like regions are populated with transposable elements
(TEs) and are almost devoid of coding sequences. Thus, it is
hypothesized that isochore-like regions emerged in genomes through the
activity of a few families of TEs.
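
One simple way to look for alternating blocks of this kind is to compute GC content in fixed windows and merge adjacent low-GC windows. The sketch below is an illustration only. The function names, the non-overlapping window scheme, and the threshold are assumptions, not the method used to find the blocks described above.

```python
def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases, or None if there are none."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return None
    return (seq.count("G") + seq.count("C")) / counted


def at_rich_blocks(seq, window, max_gc):
    """Return merged half-open (start, end) blocks whose windows have GC <= max_gc.

    Windows do not overlap; a trailing partial window is ignored, and
    windows without any A/C/G/T bases are skipped.
    """
    blocks = []
    for start in range(0, len(seq) - window + 1, window):
        gc = gc_fraction(seq[start:start + window])
        if gc is not None and gc <= max_gc:
            end = start + window
            if blocks and blocks[-1][1] == start:
                # Extend the previous block instead of opening a new one.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


if __name__ == "__main__":
    toy = "GCGCGCGC" + "AT" * 8 + "GGCCGGCC"
    print(at_rich_blocks(toy, window=8, max_gc=0.3))
```

On real chromosomes the window would be in the kilobase range, and both the window size and the GC threshold would need tuning for the genome at hand.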

Subtelomeric regions
Organisms may have their genes located near telomeres, which are often
rich in TEs. Thus, genes located in telomeres or in subtelomeric regions
tend to evolve at higher rates than the rest of the genome.

Accessory chromosomes
Several organisms have accessory chromosomes (ACs) that differ from
the remainder of the genome in several structural features, such as the
number of repeats, gene density and GC content. ACs often carry genes
with important functions, but they are termed accessory because their
loss does not affect the organism’s viability. Repetitive
sequences such as TEs are ubiquitous in eukaryotes, although TE content
and distribution can differ remarkably between species. For example, the
TE content in fungi ranges from very streamlined genomes, as found
in the yeast Saccharomyces cerevisiae (~3.3%), to genomes with a TE
content above 80%, such as that of the ectomycorrhizal fungus
Cenococcum geophilum. TE insertions tend to accumulate in ACs or in
accessory compartments of core chromosomes. Thus, it is thought that
ACs are fast evolving due to TE insertions.
