
STRUCTURE AND ORGANIZATION OF CHROMATIN

SHER-E-KASHMIR UNIVERSITY OF AGRICULTURAL SCIENCES AND TECHNOLOGY OF KASHMIR
Division of Biotechnology, Faculty of Veterinary Sciences and Animal Husbandry, Shuhama-Srinagar

A lecture on the structure and organization of chromatin, presented by ZAFAR IQBAL BUHROO
(Research Scholar)

CONTENTS
1. INTRODUCTION
2. ULTRASTRUCTURE OF CHROMATIN
   2.1. Multistrand Model
   2.2. Folded Fibre Model
   2.3. Nucleosome Model
3. ORGANIZATION OF CHROMATIN
   3.1. The Nucleosome and “Beads on String”
   3.2. 30 nm Chromatin Fibre
   3.3. Higher level of DNA packing into metaphase chromosome
4. TYPES OF CHROMATIN
   4.1. Euchromatin
   4.2. Heterochromatin
      4.2.1. Constitutive Heterochromatin
      4.2.2. Facultative Heterochromatin
5. COMPOSITION OF CHROMATIN
   5.1. DNA
   5.2. Histones
   5.3. Non-Histones
6. FUNCTIONS OF CHROMATIN
7. CONCLUSION
REFERENCES

1. INTRODUCTION
The nucleus is the heart of the cell and the main distinguishing feature of eukaryotic cells. It is
an organelle submerged in a sea of cytoplasm, and it holds the genetic information that encodes the
past history and future prospects of the cell. The nucleus contains many thread-like, coiled
structures suspended in the nucleoplasm, which are known as chromatin. Chromatin is the complex of
DNA and proteins that makes up chromosomes. It can be made visible by staining with specific stains
and techniques (hence the name chromatin, which literally means coloured material). The major
proteins of chromatin are the histones, although many other chromosomal proteins have prominent
roles too. The functions of chromatin are to package DNA into a smaller volume so that it fits in
the cell, to strengthen the DNA to allow mitosis and meiosis, and to serve as a mechanism to control
gene expression and DNA replication. Chromatin is thus the mixture of DNA and proteins present in an
organized manner in the chromosomes (Fig. 1).

2. ULTRASTRUCTURE OF CHROMATIN
The ultrastructure of chromatin is an area where the electron microscope has not yet provided a
clear picture of the organization of DNA. Whole chromosome mounts as well as sections of chromosomes
have been studied with the electron microscope. Such studies demonstrated that chromosomes contain
very fine fibrils with a thickness of 2 nm. Since DNA is 2 nm wide, it is possible that a single
fibril corresponds to a single DNA molecule.

Fig. 1. Chromatin and condensed structure of chromosome

Various workers have proposed different models to describe the organization of DNA in the
chromosomes. Three such models of chromosome structure are the multi-stranded model, the folded
fibre model and the nucleosome model.

2.1. Multi-stranded model
According to this model, each chromatin fibre is on average 100 Å in diameter and is composed of two
strands, each 35-40 Å in diameter and separated from the other by 25 Å. Each strand consists of a
single DNA double helix. Four chromatin fibres (each composed of two DNA double helices) coil around
each other to form a quarter chromatid, the smallest sub-unit of the chromosome (400 Å), which
therefore contains 8 DNA double helices. Two quarter chromatids give rise to a half chromatid
(800 Å), which is made up of 16 DNA double helices. Two half chromatids coil around each other to
produce one chromatid, which is 1600 Å in diameter and made up of 32 DNA double helices. Thus a
chromosome, with its two chromatids, has 64 DNA double helices and would be 3200 Å in diameter.

2.2. Folded fibre model

A popular model was proposed by E.J. DuPraw in 1965. According to this model, the chromosome is
composed of a tightly folded fibre with a diameter of 200-300 Å. Each chromosome fibre contains only
one DNA double helix, which is in a coiled state and is coated with histone and non-histone
proteins. Folding of the chromatin fibre drastically reduces its length and at the same time
markedly increases its thickness and stainability. This folded structure normally undergoes
supercoiling, which further reduces the length and increases the thickness of the chromosome.

2.3. Nucleosome model
This model was proposed by R.D. Kornberg (1974). According to this model, DNA is tightly bound to
histone proteins, forming a repeating array of DNA-protein particles called nucleosomes. This is the
most significant and widely accepted model.

3. ORGANIZATION OF CHROMATIN
Any model of chromatin fibre structure has to account for the packing of DNA. The basic structure
shows three levels of organization of chromatin in the chromosome:
i) DNA wrapped around histones to form nucleosomes, the “beads on a string” structure.
ii) A 30 nm condensed chromatin fibre consisting of nucleosome arrays in their most compact form.
iii) Higher levels of packing into the metaphase chromosome.
These three levels of organization are illustrated in Fig. 2 and explained below.

Fig. 2. Structural organization of chromatin

3.1. The nucleosome and “beads on a string” structure
The first level of packing involves the binding of the chromosomal DNA to histones. In eukaryotes,
DNA is tightly bound to histones to form a repeating array of DNA-protein particles called
nucleosomes. Histones play a crucial role in packing this very long DNA molecule in an orderly way
into a nucleus that is only a few micrometres in diameter. Nucleosomes are thus the fundamental
packing units of chromatin and give it a “beads on a string” appearance in electron micrographs.
Each nucleosome bead is separated from the next by a region of linker DNA (Fig. 3). There are five
main types of histones, called H1, H2A, H2B, H3 and H4. Histones are very basic proteins: about 25%
of their amino acids are lysine and arginine, so they carry a large number of positively charged
side chains. These positively charged groups bind to the negatively charged phosphate groups of
DNA.

Fig. 3. Nucleosome with histone H1

The nucleosome beads can be released from chromatin by digestion with enzymes that degrade DNA, such
as the bacterial enzyme micrococcal nuclease. After digestion for a short period of time with
micrococcal nuclease, only the DNA between the nucleosome beads (linker DNA) is degraded. The rest
is protected from digestion and remains as double-stranded DNA fragments 146 base pairs long, each
bound to a specific complex of eight histones (the histone octamer) (Fig. 4). Each nucleosome is a
disc-shaped particle with a diameter of about 11 nm and a height of 5.7 nm. It consists of a core of
histones around which DNA is wound. The core consists of two discs arranged in parallel, each
composed of four histone molecules, one each of H2A, H2B, H3 and H4. The DNA molecule runs along the
rim of the discs, and a molecule of H1 sits on the outside of the nucleosome complex, acting as a
seal; 146 base pairs of DNA are associated with the nucleosome core. The length of the linker
between nucleosomes varies between species; in humans it is about 60 base pairs, giving a total
length of DNA per nucleosome of about 200 base pairs. The 146 base pairs of core DNA make a little
less than two turns around the histone octamer; together with the adjoining DNA held by H1, two
complete turns (about 166 base pairs) are sealed by the H1 molecule. Thus, on average, nucleosomes
repeat at intervals of about 200 base pairs. This is the basic level of packing of DNA in chromatin.

Fig. 4. Histone octamer (a nucleosome)
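
The repeat-length arithmetic above can be checked with a short Python sketch. The core and linker
lengths and the 11 nm disc diameter are the figures quoted in the text; the rise of 0.34 nm per base
pair for B-form DNA is an assumption brought in from outside the lecture, and the function names are
illustrative only.

```python
# Figures quoted in the lecture text.
CORE_DNA_BP = 146              # DNA associated with the nucleosome core
LINKER_DNA_BP = 60             # approximate linker length in humans
NUCLEOSOME_DIAMETER_NM = 11.0  # diameter of the disc-shaped particle

# Assumption (not stated in the lecture): rise of B-form DNA per base pair.
RISE_NM_PER_BP = 0.34


def repeat_length_bp(core_bp=CORE_DNA_BP, linker_bp=LINKER_DNA_BP):
    """DNA per nucleosome repeat: core DNA plus one linker."""
    return core_bp + linker_bp


def extended_length_nm(n_bp, rise_nm=RISE_NM_PER_BP):
    """Contour length of n_bp base pairs of fully extended DNA."""
    return n_bp * rise_nm


def bead_packing_ratio(repeat_bp, bead_nm=NUCLEOSOME_DIAMETER_NM):
    """Rough compaction if each repeat occupied one bead diameter of fibre.

    This is a simplifying assumption: it treats adjacent beads as touching.
    """
    return extended_length_nm(repeat_bp) / bead_nm


if __name__ == "__main__":
    repeat = repeat_length_bp()
    print(f"repeat length: {repeat} bp")
    print(f"extended DNA per repeat: {extended_length_nm(repeat):.1f} nm")
    print(f"approximate packing ratio: {bead_packing_ratio(repeat):.1f}")
```

With these inputs the repeat comes to 206 bp, which the lecture rounds to about 200 bp. The packing
ratio of roughly six is only as good as the assumption that adjacent beads touch.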

Nucleosome structure (from Jiang and Pugh, Nature Rev. Genet. 10, 161 (2009)):
• Nucleosomes contain 2 copies each of H2A, H2B, H3 and H4
• 147 bp of DNA is wrapped around the nucleosome
• Histone tails emanate from the core
• Some nucleosomes contain histone variants
• H1 is a linker histone

3.2. 30 nm chromatin fibre
When nuclei are very gently lysed onto the electron microscope grid, most of the chromatin is seen
to be in the form of a fibre with a diameter of 30 nm. This diameter is larger than that of a single
nucleosome and suggests that the nucleosomes are organized into a higher-order structure. The 30 nm
fibre consists of closely packed nucleosomes. It probably arises from the folding of the nucleosome
chain into a solenoid structure having about six nucleosomes per turn (Klug and coworkers, 1976-85).
The fibre is formed by a histone H1 molecule binding to the linker DNA of each nucleosome at the
point where it enters and leaves the nucleosome. The histone H1 molecules interact with each other,
pulling the nucleosomes together. Thus, H1 molecules are responsible for packing nucleosomes into
30 nm fibres. The H1 molecule has an evolutionarily conserved globular core, or central region,
linked to extended amino-terminal and carboxyl-terminal “arms” whose amino acid sequences have
evolved much more rapidly. Each H1 molecule binds through its globular portion to a unique site on a
nucleosome and has arms that are thought to extend to contact other sites on the histone cores of
adjacent nucleosomes, so that the nucleosomes are pulled together into a regular repeating array,
giving the chromatin its 30 nm fibre structure. The binding of H1 molecules to chromatin tends to
create a local polarity that the chromatin otherwise lacks (Fig. 5).

Fig. 5. 30 nm chromatin fibre showing solenoid structure

Fig. 6. Organization of chromatin in chromosome


3.3. Higher level of DNA packing into the metaphase chromosome
Increasing levels of packing are observed within the nucleus. The highest level of packing is found
in chromosomes at the metaphase stage of cell division. In an experiment, the histones are removed
from metaphase chromosomes by adding the polyanion dextran sulphate. Histone-depleted chromosomes
are found to have a central core, or scaffold, surrounded by a halo of loops of DNA. The scaffold is
made up of non-histone proteins and retains the general shape of the metaphase chromosome. Each
chromosome has two scaffolds, one for each chromatid, connected together at the centromere region.
When the histones are removed, the DNA, which was packed about 40-fold in the 30 nm chromatin fibre,
becomes extended and produces loops with an average length of 25 µm (about 75,000 base pairs). In
each loop the DNA exits from the scaffold and returns to an adjacent point. On the basis of these
observations, a model of chromosome structure was prepared by Laemmli and coworkers (1979-1904). In
Laemmli’s radial loop model, DNA is arranged in loops anchored to a non-histone scaffold. Because
the lateral loops contain 25 µm of DNA, after contracting 40-fold into the 30 nm fibre they would be
only about 0.6 µm long, a length consistent with the diameter of the metaphase chromosome (1 µm).
Fig. 6 shows how the chromatin is arranged so that the bases of the loops form a scaffold in the
centre of the chromatid. Thus, in the early stages of cell division the chromatin strands become
more and more condensed. They cease to function as accessible genetic material (transcription stops)
and become a compact, transportable form. This compact form makes the individual chromosomes
visible, and they form the classic four-arm structure, a pair of sister chromatids attached to each
other at the centromere. During division, long microtubules extending from opposite ends of the cell
attach to the centromere. The microtubules then pull the chromatids apart, so that each daughter
cell inherits one set of chromatids. Once the cells have divided, the chromatids uncoil and can
function again as chromatin. In spite of their appearance, chromosomes are highly condensed
structures, which enables these giant DNA molecules to be contained within a cell nucleus.
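
The loop dimensions in the radial loop model follow from simple arithmetic, sketched below in
Python. The 25 µm loop length and the 40-fold packing are taken from the text; the 0.34 nm rise per
base pair is an outside assumption for B-form DNA, and the function names are hypothetical.

```python
LOOP_LENGTH_UM = 25.0    # average extended loop length, from the text
PACKING_30NM_FOLD = 40   # packing of DNA in the 30 nm fibre, from the text

# Assumption (not stated in the lecture): rise of B-form DNA per base pair.
RISE_NM_PER_BP = 0.34


def loop_base_pairs(loop_um, rise_nm=RISE_NM_PER_BP):
    """Base pairs in a loop of extended DNA that is loop_um micrometres long."""
    return loop_um * 1000.0 / rise_nm


def condensed_loop_um(loop_um, fold=PACKING_30NM_FOLD):
    """Length of the same loop once packed `fold` times into 30 nm fibre."""
    return loop_um / fold


if __name__ == "__main__":
    print(f"base pairs per loop: {loop_base_pairs(LOOP_LENGTH_UM):,.0f}")
    print(f"condensed loop length: {condensed_loop_um(LOOP_LENGTH_UM):.3f} um")
```

Under the 0.34 nm assumption a 25 µm loop holds on the order of 70,000 base pairs, and the condensed
loop comes out at 0.625 µm, in line with the roughly 0.6 µm quoted above.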

Fig. 4. Higher levels of DNA packing in chromosome

4. TYPES OF CHROMATIN
Two distinct types of chromatin have been distinguished on the basis of their staining properties:
euchromatin and heterochromatin.

4.1. Euchromatin
This is the lightly packed form of chromatin that is rich in genes. It takes up light stain and
represents most of the chromatin that disperses after mitosis is completed. Euchromatin consists of
structural genes which replicate and are transcribed during the G1 and S phases of interphase. It is
considered genetically active chromatin, since it has a role in the phenotypic expression of the
genes. In euchromatin, DNA is found packed in a 3-8 nm fibre. During metaphase it takes up dark
stain.

4.2. Heterochromatin
This is a tightly packed form of chromatin that takes up deep stain during interphase and prophase
but takes up light stain at metaphase. Chromomeres, centromeric regions and knobs also stain darkly;
of these, only the centromeric regions and knobs are true heterochromatin (chromomeres are
transcribed and so are not true heterochromatin). In some cells the centromeric regions of all the
chromosomes fuse to form a large heterochromatic mass called the chromocentre. Heterochromatin
consists of highly repetitive DNA sequences. It replicates late during the S phase of the cell cycle
and is not transcribed. Heterochromatin has been further classified into two types: constitutive
heterochromatin and facultative heterochromatin.

4.2.1. Constitutive heterochromatin
In this type of heterochromatin, the DNA is permanently inactive and remains in the condensed state
throughout the cell cycle. This most common type of heterochromatin occurs around the centromere, in
the telomeres and in the C-bands of the chromosomes. It takes up deep stain.

Constitutive heterochromatin contains short repeated sequences of DNA called satellite DNA. It is so
called because, upon ultracentrifugation, it separates from the main component of DNA.

4.2.2. Facultative heterochromatin
This is essentially euchromatin that has undergone heterochromatinization. It is not permanently
maintained in the condensed state; instead it undergoes periodic dispersal whenever it becomes
transcriptionally active. Frequently in facultative heterochromatin, one chromosome of a pair
becomes either totally or partially heterochromatic. An example of facultative heterochromatin is
X-chromosome inactivation in female mammals: one X chromosome is packaged in facultative
heterochromatin and silenced, while the other X chromosome is packaged in euchromatin and is
expressed. The silenced chromosome is inactive and forms, at interphase, the sex chromatin or Barr
body (named after Murray L. Barr). The Barr body contains DNA which is not transcribed, and it is
not found in normal males.

5. CHEMICAL COMPOSITION OF CHROMATIN
Chromatin is composed of 20-40% DNA, 50-65% protein and 5-10% RNA. These proportions vary from
species to species and also among the tissues of the same species.

5.1. DNA of chromatin
DNA is the most important chemical constituent of chromatin, since it plays the central role in
controlling heredity. The most convenient unit for measuring DNA is the picogram.
C-value: The DNA in nuclei was stained using the Feulgen reaction, and the amount of stain in single
nuclei was measured using a special microscope called a cytophotometer. This technique confirmed
that nuclei contain a constant amount of DNA. Thus, all the cells in an organism contain the same
DNA content (2C), provided that they are diploid. Gametes are haploid and therefore have half the
DNA content (1C). Some tissues, such as liver, contain occasional cells that are polyploid, and
their nuclei have a correspondingly higher DNA content (4C or 8C). Thus, each species has a
characteristic DNA content which is constant in all the individuals of that species and has been
called the C-value.

5.2. RNA of chromatin
Chromatin has 5-10% RNA, which is associated with it as ribosomal RNA (rRNA), messenger RNA (mRNA)
and transfer RNA (tRNA).

5.3. Proteins of chromatin
Proteins associated with chromatin are classified into two groups:
i) Histones
ii) Non-histones

5.3.1. Histones
Histones constitute about 60% of total chromatin protein, in an almost 1:1 ratio with DNA. They are
very basic proteins because they are enriched in the amino acids arginine and lysine (they are
devoid of tryptophan). There are five types of histones in eukaryotic chromosomes, namely H1, H2A,
H2B, H3 and H4. One of the important discoveries to come from chemical studies is that H2A, H2B, H3
and H4 have been highly conserved during evolution. They play a primary role in chromatin
organization. The H1 histone, in contrast, is the least rigidly conserved. It is present only once
per 200 base pairs of DNA and is rather loosely associated with DNA. A canonical H1 is absent in
yeast (Saccharomyces cerevisiae).

5.3.2. Non-histones
Non-histones make up about 20% of total chromatin protein, and the amount is variable. About 50% of
the non-histone proteins of chromatin have been found to be structural proteins and include such
proteins as actin, α- and β-tubulin and myosin. These contractile proteins function during
chromosome condensation and in the movement of chromosomes during mitosis and meiosis. The remaining
50% of non-histones include the enzymes and factors that are involved in replication, transcription
and regulation of transcription. These proteins are not as highly conserved among organisms.

6. FUNCTION OF CHROMATIN
The function of chromatin is to carry the genetic information from one generation to another, by
encoding the past history and future prospects of the cell. DNA, being the only permanent component
of chromatin, is the sole genetic material of eukaryotes. It never leaves the cell, thus maintaining
the heredity of the cell.

7. CONCLUSION
• Chromatin is the complex of DNA and proteins that makes up chromosomes, which appear as many
  thread-like, coiled and elongated structures suspended in the nucleoplasm. Chromatin thus contains
  the genetic instructions that direct cell function.
• The first level of packing in chromatin involves the binding of DNA to histones to form the
  fundamental packing units called nucleosomes.
• The second level of packing involves packing of nucleosomes into a 30 nm thick chromatin fibre.
• The highest level of packing of chromatin in the chromosome is found at the metaphase stage of
  cell division.
• There are two distinct types of chromatin, euchromatin and heterochromatin, which differ in their
  staining properties.
• In chromatin, DNA and the basic proteins called histones are present in about equal amounts.
• DNA is the permanent component of chromosomes and is the sole genetic material of eukaryotes.

REFERENCES
1. De Robertis and De Robertis (1998), Cell and Molecular Biology, Lea & Febiger, Hong Kong.
2. Ringo, J. (2004), Fundamental Genetics, Cambridge University Press.
3. Winter, P.C., Hickey, G.I. and Fletcher, H.L. (1999), Instant Notes on Genetics, Viva Books Pvt. Ltd.
4. Hames, B.D. and Hooper, N.M. (2001), Instant Notes on Biochemistry, Viva Books Pvt. Ltd.
5. Verma, P.S. and Agarwal, V.K. (2004), Cell Biology, Genetics, Molecular Biology and Evolution,
   S. Chand and Co. Ltd.

Section 9.2 Chromosomal Organization of Genes and Noncoding DNA
Having reviewed the relationship between transcription units and genes
in prokaryotes and eukaryotes, we now consider the organization of genes on chromosomes and
the relationship of noncoding DNA sequences to coding sequences.

Genomes of Higher Eukaryotes Contain Much Nonfunctional DNA


The abundance of noncoding sequences in the genomes of higher organisms is illustrated
in Figure 9-3, which depicts the protein-coding regions in an 80-kb stretch of DNA from the
yeast S. cerevisiae and in the β-globin gene cluster of humans, also about 80 kb long. Note that
in the single-celled yeast, protein-coding regions are closely spaced along the DNA sequence,
whereas only a small fraction of the human DNA encodes protein. DNA sequencing and
identification of exons has revealed that in higher organisms there is a considerable amount of
DNA that does not encode protein. In fact, the β-globin gene cluster is unusually rich in protein-
coding sequences compared with other regions of vertebrate DNA. In the 60-kb region including
the chicken lysozyme gene, for example, the coding exons total less than 500 base pairs. Because
no function has yet been found for most of the noncoding DNA in higher eukaryotes, it is
commonly referred to as nonfunctional.

Figure 9-3
Diagrams of ≈80-kb region from chromosome III of the yeast S. cerevisiae and the β-globin gene
cluster on human chromosome 11. (a) In the yeast DNA, blue boxes indicate open reading
frames; it is not clear whether all these potential protein-coding (more...)
Different selective pressures during evolution may account, at least in part, for this remarkable
difference in the amount of nonfunctional DNA in microorganisms and multicellular organisms.
For example, microorganisms must compete for limited amounts of nutrients in their
environment, and metabolic economy thus is a critical characteristic. Since synthesis of
nonfunctional (i.e., noncoding) DNA requires time and energy, presumably there was selective
pressure to lose nonfunctional DNA during the evolution of microorganisms. On the other hand,
natural selection in vertebrates depends largely on their behavior. The energy invested in DNA
synthesis is trivial compared with the metabolic energy required for the movement of muscles;
thus there was little selective pressure to eliminate nonfunctional DNA in vertebrates.

Cellular DNA Content Does Not Correlate with Phylogeny


The total amount of chromosomal DNA in different animals and plants does not vary in a
consistent manner with the apparent complexity of the organisms. Yeasts, fruit flies, chickens,
and humans have successively larger amounts of DNA in their haploid chromosome sets (0.015,
0.15, 1.3, and 3.2 picograms, respectively), in keeping with what we perceive to be the
increasing complexity of these organisms. Yet the vertebrates with the greatest amount of DNA
per cell are amphibians, which are surely less complex than humans in their structure and
behavior. Many plant species also have considerably more DNA per cell than humans have. For
example, the DNA content per cell of wheat, broad beans, and garden onions (7.0, 14.6, and 16.8
picograms, respectively) ranges from about two to more than five times that of humans, and
tulips have ten times as much DNA per cell as humans.
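
The comparisons in this paragraph can be reproduced with a few lines of Python. The picogram values
are those quoted above and are treated here as directly comparable, as the text does; the conversion
of 1 pg to about 0.978 × 10^9 base pairs is a commonly used approximation that is not part of the
source, and the names are illustrative.

```python
# DNA content in picograms, as quoted in the text.
DNA_PG = {
    "yeast": 0.015,
    "fruit fly": 0.15,
    "chicken": 1.3,
    "human": 3.2,
    "wheat": 7.0,
    "broad bean": 14.6,
    "garden onion": 16.8,
}

# Assumption (not from the text): approximate base pairs per picogram of DNA.
BP_PER_PG = 0.978e9


def ratio_to_human(species):
    """DNA content of a species relative to the human value."""
    return DNA_PG[species] / DNA_PG["human"]


def approx_base_pairs(species):
    """Rough genome size in base pairs under the BP_PER_PG assumption."""
    return DNA_PG[species] * BP_PER_PG


if __name__ == "__main__":
    for name in sorted(DNA_PG, key=DNA_PG.get):
        print(f"{name:13s} {DNA_PG[name]:7.3f} pg  "
              f"{ratio_to_human(name):6.3f} x human  "
              f"~{approx_base_pairs(name):.2e} bp")
```

Sorting by DNA content makes the point of the section visible at a glance: the three plants all sit
above humans in the list.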
The DNA content per cell also varies considerably among closely related species. All insects or
all amphibians would appear to be similarly complex, but the amount of haploid DNA in species
within each of these classes varies by a factor of 100. The same variation in DNA content per cell
is common within groups of plants that have similar structures and life cycles. For example, the
broad bean contains about three to four times as much DNA per cell as the kidney bean.
These facts further suggest that much of the DNA in certain organisms is “extra” or expendable
— that is, it does not encode RNA or have any regulatory or structural function. The total
amount of DNA per haploid cell in an organism is referred to as the C value; the failure of C
values to correspond to phylogenetic complexity is called the C-value paradox. This perplexing
variation in genome size occurs mainly because eukaryotic chromosomes contain variable
amounts of DNA with no demonstrable function, both between genes and within genes in
introns. As discussed later, much of this apparently nonfunctional DNA is composed of
repetitious DNA sequences, some of which are never transcribed and most all of which are likely
dispensable. The different classes of eukaryotic DNA sequences discussed in the following
sections are summarized in Table 9-1.

Table 9-1
Classification of Eukaryotic DNA.

Protein-Coding Genes May Be Solitary or Belong to a Gene Family


In multicellular organisms, roughly 25 – 50 percent of the protein-coding genes are represented
only once in the haploid genome and thus are termed solitary genes. The remaining protein-
coding genes belong to families comprising two or more similar genes.
A well-studied example of a solitary protein-coding gene is the chicken lysozyme gene
mentioned previously. The 15-kb DNA sequence encoding chicken lysozyme constitutes a
simple transcription unit (i.e., a single gene) containing four exons and three introns (Figure 9-4).
The flanking regions, extending for about 20 kb upstream and downstream from the transcription
unit, do not encode any detectable mRNAs. Lysozyme, an enzyme that cleaves the
polysaccharides in bacterial cell walls, is an abundant component of chicken egg-white protein
and also is found in human tears. Its activity helps to keep the surface of the eye and the chicken
egg sterile.

Figure 9-4
The chicken lysozyme gene and its surrounding regions. This 15-kb simple transcription unit
contains four exons (blue) and three introns (tan). The positions indicated by red arrows are
repeated Alu sequences found at many sites elsewhere in the genome. (more...)
Frequently, the DNA that lies within 5 – 10 kb of a particular gene contains sequences that are
close but inexact copies of the gene. Such sequences, which are thought to have arisen by
duplication of an ancestral gene, are referred to as duplicated protein-coding genes; duplicated
genes probably constitute half of the protein-coding DNA in vertebrate genomes. A set of
duplicated genes that encode proteins with similar but nonidentical amino acid sequences is
called a gene family; the encoded closely related, homologous proteins constitute a protein
family. A few protein families, such as protein kinases, transcription factors, and vertebrate
immunoglobulins, include hundreds of members. Most families, however, include from just a
few to 30 or so members; common examples are cytoskeletal proteins, 70-kDa heat-shock
proteins, myosin heavy chain, chicken ovalbumin, and the α- and β-globins in vertebrates.
The genes encoding the β-like globins are a good example of a gene family. As shown in Figure
9-3b, the β-like globin gene family contains five functional genes designated β, δ, Aγ, Gγ, and ϵ;
the encoded polypeptides are similarly designated. Two identical β-like globin polypeptides
combine with two identical α-globin polypeptides (encoded by another gene family) and with
four small heme groups to form a hemoglobin molecule (see Figure 3-10). All the hemoglobins
formed from the different β-like globins carry oxygen in the blood, but they exhibit somewhat
different properties that are suited to specific roles in human physiology. For example,
hemoglobins containing either the Aγ or Gγ polypeptides are expressed only during fetal life.
Because these fetal hemoglobins have a higher affinity for oxygen than adult hemoglobins, they
can effectively extract oxygen from the maternal circulation in the placenta. The lower oxygen
affinity of adult hemoglobins, which are expressed after birth, permits better release of oxygen to
the tissues, especially muscles, which have a high demand for oxygen during exercise.
The different β-globin genes probably arose by duplication of an ancestral gene, most likely as
the result of an “unequal crossover” during recombination in a germ-cell (egg or sperm)
precursor (Figure 9-5). Over evolutionary time the two copies of the gene that resulted
accumulated random mutations; beneficial mutations that conferred some refinement in the basic
oxygen-carrying function of hemoglobin were retained by natural selection. Repetitions of this
process are thought to have resulted in the evolution of the contemporary globin-like genes
observed in humans and other complex species today.

Figure 9-5
Gene duplication resulting from unequal crossing over. Each parental chromosome (top) contains
one ancestral globin gene containing three exons and two introns. Homologous L1 repeated
sequences lie 5′ and 3′ of the globin gene. The parental (more...)
Two regions in the human β-like globin gene cluster contain nonfunctional sequences,
called pseudogenes, similar to those of the functional β-like globin genes (see Figure 9-3b).
Sequence analysis shows that these pseudogenes have the same apparent exon-intron structure as
the functional β-like globin genes, suggesting that they also arose by duplication of the same
ancestral gene. However, sequence drift during evolution generated sequences that either
terminate translation or block mRNA processing, rendering such regions nonfunctional even if
they were transcribed into RNA. Because such pseudogenes are not deleterious, they remain in
the genome and mark the location of a gene duplication that occurred in one of our ancestors. As
discussed in a later section, other nonfunctional gene copies can arise by reverse transcription of
mRNA into cDNA and integration of this intron-less DNA into a chromosome.
Several different gene families encode the various proteins that make up the cytoskeleton. These
proteins are present in varying amounts in almost all cells. In vertebrates, the major cytoskeletal
proteins are the actins, tubulins, and intermediate filament proteins like the keratins (Chapters 18
and 19). Although the physiologic rationale for these protein families is not as obvious as it is for
the globins, the different members of a family probably have similar but subtly different
functions suited to the particular type of cell in which they are expressed.

Tandemly Repeated Genes Encode rRNAs, tRNAs, and Histones


In invertebrates and some vertebrates, the genes encoding rRNAs, tRNAs, histones (a family of
proteins associated with eukaryotic nuclear DNA), and several other proteins occur as tandemly
repeated arrays. These are distinguished from the duplicated genes of gene families in that the
multiple tandemly repeated genes encode identical or nearly identical proteins or functional
RNAs. Most often copies of a sequence appear one after the other, in a head-to-tail fashion, over
a long stretch of DNA. Within a tandem array of rRNA or tRNA genes, each copy is exactly, or
almost exactly, like all the others. Although the transcribed portions of rRNA genes are the same
in a given individual, the nontranscribed spacer regions between the transcribed regions can
vary. Arrays of tandemly repeated histone DNA are somewhat more complex; however, each
histone gene, too, has multiple identical copies.
The tandemly repeated rRNA, tRNA, and histone genes are needed to meet the great cellular
demand for their transcripts. Most of the RNA in a cell consists of rRNA and tRNA.
Assuming RNA polymerase molecules move at a fixed speed, there must be a limit to the
number of RNA copies that transcription of a single gene can provide during one cell generation,
even if it is fully loaded with polymerase molecules. If more RNA is required than can be
transcribed from one gene, multiple copies of the gene are necessary. For example, during early
embryonic development in humans, many embryonic cells have a doubling time of ≈24 hours
and contain 5 – 10 million ribosomes. To produce enough rRNA to form this many ribosomes,
an embryonic human cell needs at least 100 copies of the pre-rRNA gene, and most of these must
be close to maximally active for the cell to divide every 24 hours (Table 9-2). That is, multiple
RNA polymerases must be loaded onto and transcribing each pre-rRNA gene at the same time
(see Figure 11-49). The importance of repeated rRNA genes is illustrated by Drosophila mutants
called bobbed (because they have stubby wings), which lack a full complement of the tandemly
repeated rRNA genes. A bobbed mutation that reduces the number of rRNA genes to less than
≈50 is a recessive lethal mutation.
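
The argument about gene copy number can be made concrete with a back-of-the-envelope Python sketch.
Only the ribosome count (5 – 10 million) and the 24-hour doubling time come from the text; the
transcription-unit length, elongation rate, and number of polymerases per gene used below are
hypothetical placeholders chosen for illustration, not the values behind Table 9-2.

```python
import math


def transcripts_per_gene(polymerases_per_gene, elongation_nt_per_s,
                         gene_length_nt, generation_s):
    """Transcripts one gene yields per generation.

    Assumes every polymerase moves at the same fixed speed and the gene
    stays loaded with the same number of polymerases throughout.
    """
    return polymerases_per_gene * elongation_nt_per_s * generation_s / gene_length_nt


def gene_copies_needed(ribosomes, polymerases_per_gene, elongation_nt_per_s,
                       gene_length_nt, generation_s):
    """Minimum gene copies to make one pre-rRNA per ribosome per generation."""
    per_gene = transcripts_per_gene(polymerases_per_gene, elongation_nt_per_s,
                                    gene_length_nt, generation_s)
    return math.ceil(ribosomes / per_gene)


if __name__ == "__main__":
    RIBOSOMES = 10_000_000      # upper figure from the text
    GENERATION_S = 24 * 3600    # 24-hour doubling time from the text
    GENE_LENGTH_NT = 13_000     # hypothetical transcription-unit length
    ELONGATION_NT_PER_S = 50    # hypothetical polymerase speed
    for n_pol in (1, 10, 100):  # hypothetical loading levels
        copies = gene_copies_needed(RIBOSOMES, n_pol, ELONGATION_NT_PER_S,
                                    GENE_LENGTH_NT, GENERATION_S)
        print(f"{n_pol:3d} polymerases per gene -> {copies} gene copies")
```

Whatever placeholder values are chosen, the required copy number falls in inverse proportion to the
number of polymerases per gene, which is why both many gene copies and heavy polymerase loading are
needed.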

Table 9-2
Effect of Gene Copy Number and Loading with RNA Polymerase on Rate of Pre-rRNA
Synthesis in Human Cells.
All eukaryotes, including yeasts, contain 100 or more copies of the genes encoding 5S rRNA
and pre-rRNA. More than 20,000 copies of the 5S rRNA gene are present in frogs. The copy
number for individual tRNA genes ranges from 10 to 100.

Reassociation Experiments Reveal Three Major Fractions of Eukaryotic DNA
Besides duplicated protein-coding genes and tandemly repeated genes, eukaryotic cells contain
multiple copies of other DNA sequences in the genome, generally referred to as repetitious DNA
(see Table 9-1). Some of these sequences are quite short and occur as tandem repeats; others are
much longer and are interspersed at many places in the genome. The existence of these repeated
sequences was first recognized in reassociation experiments in which denatured eukaryotic DNA
was observed to renature nonuniformly; that is, some of it reassociated much more rapidly than
the bulk of cellular DNA.
In these studies, the total DNA of an organism was broken into fragments with an average length
of about a thousand base pairs. The DNA was then melted into single strands and placed under
conditions that allow strand reassociation to occur (e.g., a favorable ion concentration and a
favorable temperature). If none of the DNA fragments contained sequences that were repeated in
the genome, they all would be expected to re-form duplexes at about the same speed. However, a
fragment containing a sequence repeated many times in the genome would find
a complementary partner more quickly than a fragment with a sequence that occurred only once
per haploid genome, because the repeated sequence would be present at a much higher
concentration. Consequently, a fragment containing a repeated sequence would reassociate faster
than a fragment with a unique sequence.
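The concentration argument can be made concrete with the ideal second-order renaturation model, in which the fraction of DNA still single-stranded after time t is 1 / (1 + k·C0·t). The text does not state this equation; it is used here as a standard idealization, and the rate constant `k`, the function name, and the example copy numbers are assumptions chosen for illustration.

```python
def fraction_single_stranded(c0_t, k, copies=1):
    """Ideal second-order renaturation: C/C0 = 1 / (1 + k * C0 * t).

    c0_t   -- product of initial DNA concentration and elapsed time
    k      -- rate constant for a single-copy sequence (hypothetical value)
    copies -- copies of the sequence per haploid genome; a sequence present
              `copies` times is at a `copies`-fold higher concentration
    """
    return 1.0 / (1.0 + k * copies * c0_t)


# Compare a unique sequence with one repeated 1,000 times (illustrative only).
for c0_t in (0.001, 0.1, 10.0):
    unique = fraction_single_stranded(c0_t, k=1.0)
    repeated = fraction_single_stranded(c0_t, k=1.0, copies=1000)
    print(f"C0t={c0_t:g}  unique={unique:.3f}  repeated={repeated:.3f}")
```

In this model a sequence repeated N times reaches half-reassociation at a C0·t value N times smaller than a unique sequence does, which is why repeated fractions separate from the bulk of the DNA in these experiments.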
About 50 – 60 percent of mammalian DNA reassociates at a slow rate, indicating that it consists
primarily of single-copy DNA. According to Mendelian genetics, only one copy of each gene is
contained in the haploid DNA set; thus the single-copy DNA fraction is expected to contain most
of the genes encoding mRNA. However, the vast majority of single-copy DNA in the
mammalian genome is noncoding DNA between genes and in introns. It appears that only a
small fraction of the total DNA in humans, on the order of 5 percent, actually encodes proteins or
functional RNA molecules. The remainder of the single-copy DNA, which currently has no
known function other than to separate functional DNA sequences, is referred to as spacer DNA.
Another 25 – 40 percent of mammalian DNA reassociates at an intermediate rate. Cloning and
sequencing of this DNA fraction from many different animals and higher plants have revealed
that it is composed primarily of a very large number of copies of relatively few sequence
families in any specific organism. Such repetitious DNA, termed moderately repeated
DNA, or intermediate-repeat DNA, is interspersed throughout mammalian genomes. Because
these sequences can be copied and reinserted into new sites in the genome, they are
called mobile DNA elements, which we describe in the next section. A small portion of this
fraction consists of large duplicated gene families and tandemly repeated genes discussed
previously.
About 10 – 15 percent of mammalian DNA reassociates at a very rapid rate. This rapidly
reassociating type of repetitious DNA, referred to as simple-sequence DNA, is composed largely
of several different sets of short (5- to 10-bp) sequences repeated in long tandem arrays.

Simple-Sequence DNAs Are Concentrated in Specific Chromosomal Locations
Although much of the simple-sequence DNA of higher organisms is composed of tandemly
repeated, 5- to 10-bp sequences, long tandem repeats of simple sequences containing 20 – 200
nucleotides also occur in some vertebrate and plant genomes. Such tandem repeats generally
extend up to 10^5 base pairs in total length. These long stretches of simple-sequence DNA are
often referred to as satellite DNA because they are separated from the bulk of cellular DNA by
equilibrium density-gradient centrifugation. However, not all simple-sequence DNAs separate
from the bulk of cellular DNA during centrifugation.
In situ hybridization studies with metaphase chromosomes have localized simple-sequence
DNA to specific chromosomal regions. In most mammals, much of the simple-sequence DNA
lies near centromeres, discrete chromosomal regions that attach to
spindle microtubules during mitosis and meiosis (see Figure 19-39). In the chromosomes
of Drosophila melanogaster, simple-sequence DNA is concentrated in both centromeres
and telomeres, the ends of chromosomes. Some simple-sequence tandem arrays also are located
within chromosome arms in the Drosophila genome. In humans, some simple-sequence DNAs
are found at a single site on one chromosome. These sequences are useful for identifying
particular chromosomes by fluorescence in situ hybridization (FISH). For example, a particular
simple sequence in the human genome is present only in the middle of the long arm of
chromosome 16 (Figure 9-6).

Figure 9-6
Use of simple-sequence DNA as chromosomal marker. Human metaphase chromosomes stained
with a fluorescent dye and hybridized in situ with a particular simple-sequence DNA labeled
with a fluorescent biotin derivative. When viewed under the appropriate wavelength ...
Simple-sequence DNA located at centromeres is suspected to contribute to the structure and
therefore the function of the kinetochore of metaphase chromosomes. This large nucleoprotein
complex assembles at the centromere and attaches to
spindle microtubules during mitosis (Chapter 19). As yet, however, there is little clear-cut
experimental evidence demonstrating any function for most simple-sequence DNA.

DNA Fingerprinting Depends on Differences in Length of Simple-Sequence DNAs
Within a species, the nucleotide sequences of the repeat units composing simple-sequence
DNA tandem arrays are highly conserved among individuals. In contrast, differences in
the number of repeats, and thus in the length, of simple-sequence tandem arrays containing the
same repeat unit are quite common among individuals. These differences in length result from
unequal crossing over within regions of simple-sequence DNA during development of sperm
and oocyte precursors and during meiosis (Figure 9-7). As a consequence of this unequal
crossing over, the lengths of some simple-sequence tandem arrays are unique in each individual.
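A minimal sketch of how misaligned recombination changes array length is given below. It models each tandem array as a list of repeat units and swaps the distal segments at two breakpoints. The function name and the particular breakpoints are hypothetical; only the six-copy starting array follows the example in Figure 9-7.

```python
def unequal_crossover(array_a, array_b, keep_a, keep_b):
    """Exchange the distal parts of two tandem arrays.

    keep_a, keep_b -- number of repeat units retained from the left end of
                      each array; unequal values model misaligned pairing.
    """
    recombinant_1 = array_a[:keep_a] + array_b[keep_b:]
    recombinant_2 = array_b[:keep_b] + array_a[keep_a:]
    return recombinant_1, recombinant_2


# Two parental arrays with six copies each of the same repeat unit.
parent_a = [f"a{i}" for i in range(1, 7)]
parent_b = [f"b{i}" for i in range(1, 7)]

# Hypothetical misalignment: break after unit 5 in one array, unit 3 in the other.
long_array, short_array = unequal_crossover(parent_a, parent_b, keep_a=5, keep_b=3)
print(len(long_array), long_array)    # array that gained repeat units
print(len(short_array), short_array)  # array that lost repeat units
```

The total number of repeat units is conserved; one product gains exactly what the other loses, so repeated rounds of such exchange spread array lengths across a population.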

Figure 9-7
Unequal crossing over during meiosis can generate differences in lengths of simple-sequence
DNA tandem arrays. In this example, unequal recombination within a stretch of DNA containing
six copies (1 – 6) of a particular simple-sequence repeat ...
In humans and other mammals, some of the simple-sequence DNA exists in relatively short 1- to
5-kb regions made up of 20 – 50 repeat units each containing 15 to about 100 base pairs. These
regions are called minisatellites to distinguish them from the more common regions of tandemly
repeated simple-sequence DNA, which are ≈100 kb in length. The sequences of the repeat unit in
two human minisatellites are shown in Figure 9-8. Even slight differences in the total lengths of
various minisatellites from different individuals can be detected. These differences form the
basis of DNA fingerprinting, which is superior to conventional fingerprinting for identifying
individuals (Figure 9-9).
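The link between repeat number and a distinguishable band pattern can be sketched as follows. This is a simplified illustration, not a description of the laboratory procedure: it assumes each fragment length is a fixed flanking length plus the repeat count times the repeat-unit length, and the repeat counts, unit length, flank length, and function names are all hypothetical.

```python
def fragment_length_bp(repeat_count, unit_bp, flank_bp=0):
    """Length of a fragment carrying `repeat_count` copies of the repeat unit."""
    return flank_bp + repeat_count * unit_bp


def band_pattern(allele_repeat_counts, unit_bp, flank_bp=0):
    """Sorted fragment lengths expected for one individual at one minisatellite."""
    return sorted(fragment_length_bp(n, unit_bp, flank_bp)
                  for n in allele_repeat_counts)


# Hypothetical repeat counts for the two alleles carried by each individual.
individuals = {
    "individual 1": (30, 22),
    "individual 2": (30, 41),
    "individual 3": (25, 22),
}

for name, alleles in individuals.items():
    print(name, band_pattern(alleles, unit_bp=50, flank_bp=200))
```

With several minisatellites probed at once, as in Figure 9-9, the combined band patterns become increasingly unlikely to match between unrelated individuals.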

Figure 9-8
Consensus sequences of the repeat unit of human minisatellites named λ33.1 and λ33.5 based on
analysis of more than ten sets of repeats in each case. Red letters indicate positions in which base
differences have been detected; red solid ...

Figure 9-9
Human DNA fingerprints. DNA samples from three individuals (1, 2, and 3) were subjected to
Southern-blot analysis using the restriction enzyme HinfI and three labeled minisatellites as
probes (λ33.6, 33.15, and 33.5; lanes a, b, and c, respectively).

SUMMARY
• The genomes of prokaryotes and lower eukaryotes contain few nonfunctional sequences,
whereas vertebrate genomes contain many sequences that do not code for RNAs or have
any structural or regulatory function. Only about 5 percent of the genomic DNA in
humans encodes proteins or functional RNAs.
• The lack of a consistent relationship between the amount of DNA in
the haploid chromosomes of an animal or plant and its phylogenetic complexity is called
the C-value paradox.
• About half of the protein-coding genes in vertebrate genomic DNA are solitary genes,
whose sequence occurs only once in the haploid genome. The remainder are duplicated
genes, which arose by duplication of an ancestral gene and subsequent independent
mutations (see Figure 9-5).
• Duplicated genes, such as those forming the β-like globin gene family, encode closely
related proteins and generally appear as a cluster in a particular region
of DNA (see Figure 9-3). The proteins encoded by a gene family have homologous but
nonidentical amino acid sequences and exhibit similar but slightly different properties.
• In invertebrates and vertebrates, rRNAs, tRNAs, and histone proteins are encoded by
multiple copies of genes located in tandem arrays in genomic DNA.
• Single-copy DNA consists of solitary protein-coding genes, small
duplicated gene families, and spacer DNA.
• Moderately repeated DNA includes the tandemly repeated genes encoding rRNAs, tRNAs,
and histones; large duplicated gene families; and mobile DNA elements.
• Simple-sequence DNA, which consists largely of very short sequences repeated in long
tandem arrays, is preferentially located in centromeres, telomeres, and specific locations
within the arms of particular chromosomes.
• The length of a particular simple-sequence tandem array is quite variable among
individuals in a species, probably because of unequal crossing
over during meiosis (see Figure 9-7). Differences in the lengths of some simple-sequence
tandem arrays form the basis for DNA fingerprinting.
