You are on page 1of 7

Noncoding DNA

From Wikipedia, the free encyclopedia

Jump to: navigation, search

In genetics, noncoding DNA describes components of an organism's DNA sequences

that do not encode for protein sequences. In many eukaryotes, a large percentage of an
organism's total genome size is noncoding DNA, although the amount of noncoding
DNA, and the proportion of coding versus noncoding DNA varies greatly between

Much of this DNA has no known biological function and is sometimes referred to as
"junk DNA". However, many types of noncoding DNA sequences do have known
biological functions, including the transcriptional and translational regulation of protein-
coding sequences. Other noncoding sequences have likely but as-yet undetermined
function, an inference from high levels of homology and conservation seen in sequences
that do not encode proteins but appear to be under heavy selective pressure.


• 1 Fraction of noncoding genomic DNA

• 2 Junk DNA
• 3 Types of noncoding DNA sequences
o 3.1 Noncoding functional RNA
o 3.2 Cis-regulatory elements
o 3.3 Introns
o 3.4 Pseudogenes
o 3.5 Repeat sequences, transposons and viral elements
o 3.6 Telomeres
• 4 Functions of noncoding DNA
• 5 Noncoding DNA and evolution
• 6 See also
• 7 References

• 8 External links

[edit] Fraction of noncoding genomic DNA

The amount of total genomic DNA varies widely between organisms, and the proportion
of coding and noncoding DNA within these genomes varies greatly as well. More than
98% of the human genome does not encode protein sequences, including most sequences
within introns and most intergenic DNA.[1]

While overall genome size, and by extension the amount of noncoding DNA, are
correlated to organism complexity, there are many exceptions. For example, the genome
of the unicellular Polychaos dubium (formerly known as Amoeba dubia) has been
reported to contain more than 200 times the amount of DNA in humans.[2] The pufferfish
Takifugu rubripes genome is only about one eighth the size of the human genome, yet
seems to have a comparable number of genes; approximately 90% of the Takifugu
genome is noncoding DNA.[1] Most of the difference appears to lie in the junk DNA. The
extensive variation in nuclear genome size among eukaryotic species is known as the C-
value enigma or C-value paradox.[3]

About 80% of the nucleotide bases in the human genome may be transcribed,[4] but
transcription does not necessarily imply function.[5]

[edit] Junk DNA

Junk DNA, a term that was introduced in 1972 by Susumu Ohno,[6] is a provisional label
for the portions of a genome sequence of a for which no discernible function has been
identified. According to a 1980 review in Nature by Leslie Orgel and Francis Crick, junk
DNA has "little specificity and conveys little or no selective advantage to the organism".
The term is currently, however, a somewhat outdated concept, being used mainly in
popular science and in a colloquial way in scientific publications, and may have slowed
research into the biological functions of noncoding DNA.[8] Several lines of evidence
indicate that many "junk DNA" sequences have likely but unidentified functional
activity, and other sequences may have had functions in the past.[9]

Still, a large amount of sequence in these genomes falls under no existing classification
other than "junk". For example, one experiment removed 1% of the mouse genome with
no detectable effect on the phenotype.[10] This result suggests that the removed DNA was
largely nonfunctional. In addition, these sequences are enriched for the heterochromatic
histone modification H3K9me3 [11].

[edit] Types of noncoding DNA sequences

[edit] Noncoding functional RNA

Noncoding RNAs are functional RNA molecules that are not translated into protein.
Examples of noncoding RNA include ribosomal RNA, transfer RNA, Piwi-interacting
RNA and microRNA.

MicroRNAs are predicted to control the translational activity of approximately 30% of all
protein-coding genes in mammals and may be vital components in the progression or
treatment of various diseases including cancer, cardiovascular disease, and the immune
system response to infection.[12]

[edit] Cis-regulatory elements

Cis-regulatory elements are sequences that control the transcription of a gene. Cis-
elements may be located in 5' or 3' untranslated regions or within introns. Promoters
facilitate the transcription of a particular gene and are typically upstream of the coding

Enhancer sequences may exert very distant effects on the transcription levels of genes.[13]

[edit] Introns

Introns are non-coding sections of a gene, transcribed to precursor mRNA, but ultimately
removed by RNA splicing during the processing to mature messenger RNA. Many
introns appear to be mobile genetic elements.[14]

Studies of group I introns from Tetrahymena indicate that some introns appear to be
selfish genetic elements, neutral to the host because they remove themselves from
flanking exons during RNA processing and do not effect an expression bias between
alleles with and without the intron.[14] Some introns do appear to have significant
biological function, possibly through ribozyme functionality that may regulate tRNA and
rRNA activity as well as protein-coding gene expression, evident in hosts that have
become dependent on such introns over long periods of time; for example, the trnL-intron
is found in all green plants and appears to have been vertically inherited for several
billions of years, including more than a billion years within chloroplasts and an additional
2–3 billion prior in the cyanobacterial ancestors of chloroplasts.[14]

[edit] Pseudogenes

Pseudogenes are DNA sequences, related to known genes, that have lost their protein-
coding ability or are otherwise no longer expressed in the cell. Pseudogenes arise from
retrotransposition or genomic duplication of functional genes, and become "genomic
fossils" that are nonfuctional due to mutations that prevent the transcription of the gene,
such as within the gene promoter region, or fatally alter the translation of the gene, such
as premature stop codons or frameshifts.[15] Pseudogenes that are the of retrotransposition
of an RNA intermediate are known as processed pseudogenes; pseudogenes that arise
from the genomic remains of duplicated genes or residues of inactivated are
nonprocessed pseudogenes.[15]

While Dollo's Law suggests that the loss of function in pseudogenes is likely permanent,
silenced genes may actually retain function for several million years and can be
"reactivated" into protein-coding sequences[16] and a substantial number of pseudogenes
are actively transcribed.[15] Because pseudogenes are presumed to evolve without
evolutionary constraint, they can serve as a useful model of the type and frequencies
various spontaneous genetic mutations.[17]

[edit] Repeat sequences, transposons and viral elements

Transposons and retrotransposons are mobile genetic elements. Retrotransposon repeated

sequences, which include long interspersed nuclear elements (LINEs) and short
interspersed nuclear elements (SINEs), account for a large proportion of the genomic
sequences in many species. Alu sequences, classified as a short interspersed nuclear
element, are the most abundant mobile elements in the human genome. Some examples
have been found of SINEs exerting transcriptional control of some protein-encoding

Endogenous retrovirus sequences are the product of reverse transcription of retrovirus

genomes into the genomes of germ cells. Mutation within these retro-transcribed
sequences can inactivate the viral genome.

Approximately 8% of the human genome is made up of endogenous retrovirus sequences,

, and as much as 25% is recognizably formed of retrotransposons.[21] Genome size
variation in at least two kinds of plants is mostly the result of retrotransposon sequences.

[edit] Telomeres

Telomeres are regions of repetitive DNA at the end of a chromosome, which provide
protection from chromosomal deterioration during DNA replication.

[edit] Functions of noncoding DNA

Many noncoding DNA sequences have very important biological functions. Comparative
genomics reveals that some regions of noncoding DNA are highly conserved, sometimes
on time-scales representing hundreds of millions of years, implying that these noncoding
regions are under strong evolutionary pressure and positive selection.[23] For example, in
the genomes of humans and mice, which diverged from a common ancestor 65–75
million years ago, protein-coding DNA sequences account for only about 20% of
conserved DNA, with the remaining majority of conserved DNA represented in
noncoding regions.[24] Linkage mapping often identifies chromosomal regions associated
with a disease with no evidence of functional coding variants of genes within the region,
suggesting that disease-causing genetic variants lie in the noncoding DNA.[24]

Some noncoding DNA sequences are genetic "switches" that do not encode proteins, but
do regulate when and where genes are expressed.[25] According to a comparative study of
over 300 prokaryotic and over 30 eukaryotic genomes,[26] eukaryotes appear to require a
minimum amount of non-coding DNA. This minimum amount can be predicted using a
growth model for regulatory genetic networks, implying that it is required for regulatory
purposes. In humans the predicted minimum is about 5% of the total genome.
Some specific sequences of noncoding DNA may be features essential to chromosome
structure, centromere function and homolog recognition in meiosis.[27]

Some noncoding DNA sequences determine how much of a particular protein gets

Other sequences of noncoding DNA determine where transcription factors attach.[28]

[edit] Noncoding DNA and evolution

Shared sequences of apparently non-functional DNA are a major line of evidence for
common descent.[29]

Pseudogene sequences appear to accumulate mutations more rapidly than coding

sequences due to a loss of selective pressure.[17] This allows for the creation of mutant
alleles that incorporate new functions that may be favored by natural selection; thus,
pseudogenes can serve as raw material for evolution and can be considered "protogenes".

[edit] See also

• Eukaryotic chromosome fine structure
• Phylogenetic footprinting

[edit] References
1. ^ a b Elgar G, Vavouri T (July 2008). "Tuning in to the signals: noncoding
sequence conservation in vertebrate genomes". Trends Genet. 24 (7): 344–52.
doi:10.1016/j.tig.2008.04.005. PMID 18514361.
2. ^ Gregory TR, Hebert PD (April 1999). "The modulation of DNA content:
proximate causes and ultimate consequences". Genome Res. 9 (4): 317–24.
PMID 10207154.
3. ^ Wahls, W.P., et al. (1990). "Hypervariable minisatellite DNA is a hotspot for
homologous recombination in human cells". Cell 60 (1): 95–103. PMID 2295091.
4. ^ Pennisi, Elizabeth (2007). "DNA Study Forces Rethink of What It Means to Be
a Gene". Science 316 (5831): 1556–7. doi:10.1126/science.316.5831.1556.
PMID 17569836.
5. ^ Struhl, Kevin (2007). "Transcriptional noise and the fidelity of initiation by
RNA polymerase II". Nature Structural & Molecular Biology 14: 103–105.
doi:10.1038/nsmb0207-103. PMID 17277804.
6. ^ So much "junk" DNA in our genome, In Evolution of Genetic Systems (1972).
H. H. Smith. ed. Gordon and Breach, New York. pp. 366–370.
7. ^ Orgel LE, Crick FH (April 1980). "Selfish DNA: the ultimate parasite". Nature
284 (5757): 604–7. doi:10.1038/284604a0. PMID 7366731.
8. ^ Khajavinia A, Makalowski W (May 2007). "What is "junk" DNA, and what is it
worth?". Scientific American 296 (5): 104. PMID 17503549.
"The term "junk DNA" repelled mainstream researchers from studying noncoding
genetic material for many years".
9. ^ Biémont, Christian (2006). "Genetics: Junk DNA as an evolutionary force".
Nature 443: 521. doi:10.1038/443521a.
10. ^ M.A. Nobrega, Y. Zhu, I. Plajzer-Frick, V. Afzal and E.M. Rubin (2004).
"Megabase deletions of gene deserts result in viable mice". Nature 431 (7011):
988–993. doi:10.1038/nature03022.
11. ^ Rosenfeld, Jeffrey A; Wang, Zhibin; Zhao, Keji; DeSalle, Rob; Zhang, Michael
Q (31 March 2009). "Determination of enriched histone modifications in non-
genic portions of the human genome.". BMC Genomics 10 (143).
doi:10.1186/1471-2164-10-143. PMID 19335899.
12. ^ Li M, Marin-Muller C, Bharadwaj U, Chow KH, Yao Q, Chen C (April 2009).
"MicroRNAs: control and loss of control in human physiology and disease".
World J Surg 33 (4): 667–84. doi:10.1007/s00268-008-9836-x. PMID 19030926.
13. ^ Visel A, Rubin EM, Pennacchio LA (September 2009). "Genomic views of
distant-acting enhancers". Nature 461 (7261): 199–205. doi:10.1038/nature08451.
PMID 19741700.
14. ^ a b c Nielsen H, Johansen SD (2009). "Group I introns: Moving in new
directions". RNA Biol 6 (4): 375–83. doi:10.4161/rna.6.4.9334. PMID 19667762.
15. ^ a b c Zheng D, Frankish A, Baertsch R, et al. (June 2007). "Pseudogenes in the
ENCODE regions: consensus annotation, analysis of transcription, and
evolution". Genome Res. 17 (6): 839–51. doi:10.1101/gr.5586307.
PMID 17568002.
16. ^ Marshall CR, Raff EC, Raff RA (December 1994). "Dollo's law and the death
and resurrection of genes". Proc. Natl. Acad. Sci. U.S.A. 91 (25): 12283–7.
PMID 7991619. PMC 45421.
17. ^ a b Petrov DA, Hartl DL (2000). "Pseudogene evolution and natural selection for
a compact genome". J. Hered. 91 (3): 221–7. PMID 10833048.
18. ^ Ponicsan SL, Kugel JF, Goodrich JA (February 2010). "Genomic gems: SINE
RNAs regulate mRNA production". Curr Opin Genet Dev.
doi:10.1016/j.gde.2010.01.004. PMID 20176473.
19. ^ Häsler J, Samuelsson T, Strub K (July 2007). "Useful 'junk': Alu RNAs in the
human transcriptome". Cell. Mol. Life Sci. 64 (14): 1793–800.
doi:10.1007/s00018-007-7084-0. PMID 17514354.
20. ^ S. Blaise , N. de Parseval and T. Heidmann (2005). "Functional characterization
of two newly identified Human Endogenous Retrovirus coding envelope genes".
Retrovirology 2 (19). doi:10.1186/1742-4690-2-19.
21. ^ P.L. Deininger, M.A. Batzer (October 2002). "Mammalian retroelements".
Genome Res. 12 (10): 1455–1465. PMID 12368238.
22. ^ [1] [2]
23. ^ Ludwig MZ (December 2002). "Functional evolution of noncoding DNA".
Curr. Opin. Genet. Dev. 12 (6): 634–9. PMID 12433575.
24. ^ a b Cobb J, Büsst C, Petrou S, Harrap S, Ellis J (April 2008). "Searching for
functional genetic variants in non-coding DNA". Clin. Exp. Pharmacol. Physiol.
35 (4): 372–5. doi:10.1111/j.1440-1681.2008.04880.x. PMID 18307723.
25. ^ Carroll, Sean B., et al. (May 2008). "Regulating Evolution". Scientific
American 298 (5): 60–67.
26. ^ S. E. Ahnert, T. M. A. Fink and A. Zinovyev (2008). "How much non-coding
DNA do eukaryotes require?" (PDF). J Theor. Biol. 252: 587–592.
27. ^ Subirana JA, Messeguer X (March 2010). "The most frequent short sequences
in non-coding DNA". Nucleic Acids Res. 38 (4): 1172–81.
doi:10.1093/nar/gkp1094. PMID 19966278.
28. ^ a b Callaway, Ewen (March 2010). "Junk DNA gets credit for making us who we
are". New Scientist.
29. ^ "Plagiarized Errors and Molecular Genetics", talkorigins, by Edward E. Max,
M.D., Ph.D.
30. ^ Balakirev ES, Ayala FJ (2003). "Pseudogenes: are they "junk" or functional
DNA?". Annu. Rev. Genet. 37: 123–51.
doi:10.1146/annurev.genet.37.040103.103949. PMID 14616058.

• Bennett, M.D. and I.J. Leitch (2005). "Genome size evolution in plants". in T.R.
Gregory (ed.). The Evolution of the Genome. San Diego: Elsevier. pp. 89–162.
• Gregory, T.R (2005). "Genome size evolution in animals". in T.R. Gregory (ed.).
The Evolution of the Genome. San Diego: Elsevier.
• Shabalina SA, Spiridonov NA (2004). "The mammalian transcriptome and the
function of non-coding DNA sequences". Genome Biol. 5 (4): 105.
doi:10.1186/gb-2004-5-4-105. PMID 15059247.
• Castillo-Davis CI (October 2005). "The evolution of noncoding DNA: how much
junk, how much func?". Trends Genet. 21 (10): 533–6.
doi:10.1016/j.tig.2005.08.001. PMID 16098630.

Related Interests