
Fundamental

Molecular Biology
Second Edition

Lisabeth A. Allison

Chapter 5
Genome Organization and Evolution

Copyright © 2012 John Wiley & Sons, Inc. All rights reserved.
Cover photo: Julie Newdoll/www.brushwithscience.com “Dawn of the
Double Helix”, oil and mixed media on canvas, © 2003
Outline

5.1 Introduction
5.2 Genome organization varies in different organisms
5.3 Packaging of the eukaryotic genome
5.4 The majority of the eukaryotic genome is noncoding
5.5 Lateral gene transfer in the eukaryotic genome
5.6 Prokaryotic and viral genome organization

2
5.1

• DNA is associated with architectural proteins and
packaged into chromosomes.

• But genetic information has to be accessible for
processes such as replication and transcription.

• Genomes are mosaic and reflect a complex
evolutionary history.

3
5.2

The domains of life

• Scientists first divided life into prokaryotes and
eukaryotes.

• Comparison of 16S rRNA sequences and
ribosome structure revealed the archaea.

• Three major groups of living things: eukaryotes,
bacteria, and archaea.

4
Two models for the divisions of life
• Three-domain tree: bacteria, eukaryotes, archaea (Fig. 5.1A)

• Eocyte tree: bacteria and archaea (Fig. 5.1B)
Eukaryotes are a type of archaea derived from ancestral cells
called eocytes.

5
Last Universal Common (Cellular)
Ancestor (LUCA)

• A first simple life-form that existed on Earth more
than 4 billion years ago.

• All life evolved from LUCA.

6
5.2

Genome size

[Figure: genome size plotted against number of protein-coding genes. Key: black, viruses; green, bacteria; red, archaea; blue, unicellular eukaryotes; orange, multicellular eukaryotes.]

7
Viruses with DNA genomes
• Not considered organisms because they are not made
of cells.

• But, viruses have a genome and they evolve.

8
Two classes of genomes

• Small genomes of viruses, archaea, bacteria (<10 Mb),
and many unicellular eukaryotes (<20 Mb)
– Protein-coding and RNA-coding sequences occupy most of
the nucleotide sequence.

• Large genomes of multicellular and some unicellular
eukaryotes (>100 Mb)
– The majority of the nucleotide sequence is noncoding.
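The two size classes above can be written as a simple rule. The following is a minimal sketch, not a biological classifier: the function name `genome_class` and the handling of genomes between 20 and 100 Mb are assumptions, and the example genome sizes are approximate values that do not come from the slide.

```python
# Size thresholds in megabases (Mb), taken from the two classes above.
SMALL_GENOME_MAX_MB = 20    # upper bound quoted for many unicellular eukaryotes
LARGE_GENOME_MIN_MB = 100   # lower bound quoted for large genomes


def genome_class(size_mb):
    """Assign a genome to one of the two size classes by size alone."""
    if size_mb < SMALL_GENOME_MAX_MB:
        return "small (mostly coding)"
    if size_mb > LARGE_GENOME_MIN_MB:
        return "large (mostly noncoding)"
    # Assumption: the slide does not say how to treat 20-100 Mb genomes.
    return "intermediate (not classified)"


# Illustrative, approximate genome sizes; these are not from the slide.
examples_mb = {"E. coli": 4.6, "S. cerevisiae": 12, "human": 3200}
for name, size in examples_mb.items():
    print(f"{name}: {size} Mb -> {genome_class(size)}")
```

Size alone is only a rough proxy here; the slide's real distinction is the fraction of the sequence that is coding.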

9
10
11
5.3

The problem:
• How to fit 2 meters of DNA into a <10 µm space.

The solution:
• Double-stranded linear DNA molecules are packed
into chromatin.
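The scale of the problem can be checked with a short calculation. This is a rough sketch under stated assumptions: a rise of 0.34 nm per base pair (B-form DNA) and a diploid human genome of about 6.4 billion base pairs are standard textbook values, but neither appears on this slide, and the function names are illustrative.

```python
# Assumed values (not from the slide): B-DNA rise per base pair and
# a diploid human genome of about 6.4 billion base pairs.
RISE_PER_BP_NM = 0.34
DIPLOID_GENOME_BP = 6.4e9
NUCLEUS_DIAMETER_UM = 10.0  # upper bound quoted on the slide


def dna_length_m(base_pairs, rise_nm=RISE_PER_BP_NM):
    """Contour length, in meters, of a linear double helix."""
    return base_pairs * rise_nm * 1e-9


def length_to_diameter_ratio(length_m, diameter_um):
    """How many times longer the DNA is than the space it must fit in."""
    return length_m / (diameter_um * 1e-6)


total_length_m = dna_length_m(DIPLOID_GENOME_BP)
ratio = length_to_diameter_ratio(total_length_m, NUCLEUS_DIAMETER_UM)
print(f"Total DNA length: {total_length_m:.2f} m")
print(f"Length / nuclear diameter: {ratio:,.0f}")
```

Under these assumptions the DNA comes out at roughly 2 m, about five orders of magnitude longer than the nucleus is wide.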

12
Diversity in the number of chromosomes that
make up eukaryotic genomes

• Butterflies: >200 chromosomes
• Kangaroos: 12
• Humans: 46
• Adder tongue fern: 1,260
• Male jack jumper ant: 1

13
Jack jumper ant chromosome number: females, 2; males, 1

14
Histones are small, positively charged
proteins

1884: Albrecht Kossel isolated histones, small basic
proteins, from the nuclei of goose erythrocytes.

1970s: Electron microscopic and biochemical studies
showed that the fundamental packing unit of
chromatin is the nucleosome (DNA wrapped around
histone proteins).

15
Two types of histones:
• Highly conserved core histones.
• More variable linker histones.

16
Core histones

• Small, positively charged, basic proteins.

• Molecular weight 11,000-16,000 daltons.

• Histones H2A, H2B, H3, and H4.

• Rich in arginine and lysine.

• Bound to DNA in eukaryote chromosomes to form
core octamers.

17
Linker histones
• Slightly larger, positively charged, basic proteins.

• Molecular weight >20,000 daltons.

• Histones H1, H5, etc.

• Occur between core octamers.

18
Nuclease treatment

19
Linker histones

Core histones

20
• Most eukaryotes package their genomes with
histones.
• There are some exceptions:
– Dinoflagellates package their DNA with small basic
non-histone proteins.
– Sperm DNA is compacted with basic proteins known as
protamines.

21
Nucleosomes are the fundamental packing
unit of chromatin

Beads-on-a-string: the 10 nm fiber

• Visualized by electron microscopy as a 10-11 nm fiber
after low-salt extraction.
• Beads represent DNA wrapped around the histone
core octamer.
• String represents the DNA double helix.

22
Nucleosomes
• Repeating structural element in eukaryotic
chromosomes.
• Core octamer of histones plus one molecule of the
linker histone.
• About 180 bp of DNA is associated with each nucleosome.

23
Core histone octamer

• Dimer of histones H2A and H2B at each end.

• Tetramer of histones H3 and H4 in the center.

• 146 bp of negatively charged DNA wraps nearly twice around
the positively charged histones.
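The two DNA lengths quoted for the nucleosome (about 180 bp per repeat, 146 bp on the core octamer) allow a rough estimate of linker length and nucleosome number. The sketch below assumes evenly spaced nucleosomes covering the whole genome, which is a simplification, and uses the 3,200 Mb human haploid genome size quoted elsewhere in this chapter; the function names are illustrative.

```python
HAPLOID_GENOME_BP = 3_200_000_000  # 3,200 Mb human haploid genome
BP_PER_NUCLEOSOME = 180            # DNA per repeating unit
CORE_BP = 146                      # DNA wrapped around the core octamer


def linker_bp(repeat_bp=BP_PER_NUCLEOSOME, core_bp=CORE_BP):
    """DNA per repeat that is not wrapped around the core octamer."""
    return repeat_bp - core_bp


def nucleosome_count(genome_bp, repeat_bp=BP_PER_NUCLEOSOME):
    """Whole nucleosome repeats that fit, assuming uniform spacing."""
    return genome_bp // repeat_bp


print(f"Linker DNA per repeat: {linker_bp()} bp")
print(f"Nucleosomes per haploid genome: {nucleosome_count(HAPLOID_GENOME_BP):,}")
```

On these assumptions a haploid human genome holds on the order of 18 million nucleosomes.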

24
Histone fold domain
Carboxyl (C) terminal end
– Extended histone-fold domain
– Histone-histone interactions
– Histone-DNA interactions

Amino (N) terminal charged “tails”
– Lysine-rich
– Sites of many post-translational modifications

25
Higher order structure:
the 30 nm fiber (Fig. 5.5)

• Visualized by electron microscopy in higher salt.

• Two models:
1. Classic solenoid model (not seen at physiological salt
concentrations)
2. Currently favored zig-zag ribbon model

26
27
Nucleosome

https://www.youtube.com/watch?v=4Z4KwuUfh0A

28
Further packaging of DNA involves loop
domains
• Further compaction of the 30 nm fiber into loops that contain
50-100 kb of DNA.

• Insight into loop structure comes from studies of lampbrush
chromosomes in amphibian oocytes.

• In interphase of the cell cycle, the packing ratio is 1000-fold.

29
lampbrush chromosomes

30
Fully condensed chromatin: metaphase
chromosomes

• Condensation requires ATP-hydrolyzing enzymes
and the condensin complex.
• Packing ratio of 10,000-fold.

• Each chromosome is composed of one linear,
double-stranded molecule of DNA.
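The packing ratio is the extended length of the DNA divided by the length of the packaged structure. The sketch below applies the 1000-fold (interphase) and 10,000-fold (metaphase) ratios from these slides to a hypothetical 150 Mb chromosome; the chromosome size and the 0.34 nm rise per base pair are assumptions chosen for illustration.

```python
RISE_PER_BP_NM = 0.34  # assumed B-DNA rise per base pair, not from the slides

# Packing ratios quoted on the slides.
PACKING_RATIOS = {"interphase": 1_000, "metaphase": 10_000}


def contour_length_um(base_pairs):
    """Extended length of the naked double helix, in micrometers."""
    return base_pairs * RISE_PER_BP_NM / 1000.0


def packaged_length_um(base_pairs, packing_ratio):
    """Length after compaction by the given packing ratio."""
    return contour_length_um(base_pairs) / packing_ratio


chromosome_bp = 150_000_000  # hypothetical 150 Mb chromosome
print(f"Extended DNA: {contour_length_um(chromosome_bp):,.0f} um")
for stage, ratio in PACKING_RATIOS.items():
    length = packaged_length_um(chromosome_bp, ratio)
    print(f"{stage}: about {length:.1f} um")
```

For this hypothetical chromosome, about 5 cm of DNA is reduced to roughly 50 µm in interphase and 5 µm at metaphase.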

31
The centromere provides the site of attachment
for segregation during cell division
• A fully condensed metaphase chromosome consists
of two sister chromatids connected at the centromere.

• From the centromere, the kinetochore captures
spindle microtubules, which ensure that sister
chromatids segregate correctly to daughter cells.

• Kinetochore: a structure that forms at a special site on
the centromere of the chromosome.
32
33
During mitosis, chromosomes:

• Condense
• Congregate at the metaphase plate
• Orient
• Attach to microtubules
• Are pulled apart

34
Centromere DNA typically:

• Is localized to a specific region of the chromosome.

• Consists of many repeated DNA sequences spanning
0.1-4 Mb.

35
36
Centromere structure
• Centromere DNA has little or no sequence
conservation.

• Centromere location is specified by the formation of a
specialized chromatin structure.

• The centromere-specific histone H3 variant CenH3
(= CENP-A) triggers a complex network of interactions,
leading to the fully assembled kinetochore (Fig. 5.12).

37
38
Each chromosome must contain:
• A centromere
• One or more origins of replication
• A telomere at each end

Chromosome classification
• Metacentric: centromere in the middle
• Acrocentric: centromere toward one end
• Telocentric: centromere at the end

39
Autosomes and sex chromosomes

• Chromosomes are classified as sex chromosomes or
autosomes.
• The number, size, and shape of the chromosomes make up a
species-specific set, or karyotype.

40
Examples of diversity in sex chromosome
systems
– Humans: XX (female) and XY (male)
– Birds: ZW (female) and ZZ (male)
– Insects: XX (female) and X (male)
– Duck-billed platypus: XXXXX, XXXXX (female) and
XXXXX, YYYYY (male)

41
Organization and expression of the
genetic material

Heterochromatin: chromatin that is condensed and
suppresses transcription

Euchromatin: chromatin that is more open and allows for
gene activation

42
Eukaryotic gene expression is regulated at three
levels
• DNA sequence: DNA-binding proteins associate with
regulatory elements in the DNA.

• Chromatin structure: changes in the way the DNA is
wrapped around the histones.

• Nuclear architecture: positioning of chromosomes in
“territories” in the nucleus.

43
• Early insights into how chromatin structure
changes during transcription have come from
studies of polytene chromosomes.

• Chromosome puffs represent sites of high
transcriptional activity.

[Figure: Drosophila polytene chromosomes; label: light staining]

44


5.4

C-value paradox

• The observation that the amount of DNA in the
haploid genome is not related to an organism’s
evolutionary complexity.
– e.g., wheat has 16,000 Mb of DNA, while humans
have only 3,200 Mb.

• Most genomic DNA consists of various classes of
repetitive DNA sequences.
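The wheat and human figures above make the paradox concrete. A minimal sketch of the comparison, using only the two genome sizes quoted on the slide (the dictionary and function names are illustrative):

```python
# Haploid genome sizes in megabases (Mb), as quoted on the slide.
GENOME_SIZE_MB = {"wheat": 16_000, "human": 3_200}


def fold_difference(larger, smaller, sizes=GENOME_SIZE_MB):
    """How many times bigger one haploid genome is than another."""
    return sizes[larger] / sizes[smaller]


print(f"Wheat genome is {fold_difference('wheat', 'human'):.0f}x the human genome")
```

A fivefold difference in DNA content, with no corresponding difference in complexity, is what the C-value paradox describes.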

45
Organization of the human genome

• Less than 40% of the human genome consists of
genes and gene-related sequences.
• Intergenic DNA (~60%) consists of unique or low
copy number sequences and moderately to highly
repetitive sequences.
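These percentages can be converted into approximate amounts of DNA. The sketch below assumes the 3,200 Mb haploid genome size quoted for humans on the C-value slide and treats "less than 40%" as an upper bound; the names used are illustrative.

```python
HUMAN_HAPLOID_GENOME_MB = 3200  # haploid genome size quoted on the C-value slide

# Approximate shares from the slide; 40% is an upper bound for genic DNA.
SHARES_PERCENT = {
    "genes and gene-related sequences": 40,
    "intergenic DNA": 60,
}


def share_in_mb(percent, genome_mb=HUMAN_HAPLOID_GENOME_MB):
    """Convert a percentage of the genome into megabases."""
    return genome_mb * percent / 100


for label, percent in SHARES_PERCENT.items():
    print(f"{label}: ~{share_in_mb(percent):,.0f} Mb")
```

On these round numbers, at most about 1,280 Mb is genic and roughly 1,920 Mb is intergenic.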

46
47
Repetitive DNA sequences are divided into
two major classes

• Interspersed elements

• Tandem repetitive elements

48
Interspersed elements are primarily
transposable elements (Fig. 5.16)

Genome-wide repeats that are primarily degenerate
copies of transposable elements (TE)
• Short interspersed nuclear elements (SINEs)

• Long interspersed nuclear elements (LINEs)

49
Transposable elements
(TE)

TE: DNA sequences that have the ability to move to a new
location in the genome

50
Tandem repetitive sequences (~10%) are
arranged in arrays with variable numbers of
repeats

Three subdivisions based on their length:
– Satellite DNA
– Minisatellites
– Short tandem repeats (STRs)

51
Satellite DNA

• Very highly repetitive DNA with repeat lengths of
one to several thousand base pairs.
• Buoyant density during density gradient
centrifugation differs from that of the bulk of the
DNA (Fig. 5.17).
• Large clusters in the heterochromatic regions of
chromosomes near centromeres and telomeres.

52

53
5.5

• Lateral or horizontal transfer is the transfer of DNA
between two different species, especially distantly
related species.

• Important mechanism for bacterial evolution, in
particular through movement of transposable elements.

• Evidence is accumulating for the importance of lateral
transfer in fungal, animal, and plant evolution.

54
Organelle genomes reflect an
endosymbiont origin
• Both mitochondria and chloroplasts contain their own
genetic information.
• Endosymbiont hypothesis: both organelles are
derived from primitive, free-living, bacterial-like
organisms.
• Inherited independently of the nuclear genome.

• Uniparental mode of inheritance: organelles are
contributed only by the maternal gamete.
55
56
Chloroplast DNA (cpDNA)

• Circular or linear (form uncertain) double-stranded DNA
molecule that encodes enzymes involved in photosynthesis.

• 120-160 kb

• Multiple copies (20-40) per organelle.

• Different buoyant density and base composition
compared with nuclear DNA.

57
Mitochondrial DNA (mtDNA)

• Typically a circular, double-stranded DNA molecule.

• Linear in yeast and some other fungi.

• In animals, typically 16-18 kb.

• In plants, 100 kb to 2.5 Mb.

• Multiple copies (several to 30) per organelle.

58
Mitochondrial DNA and disease

• Defects in mtDNA can lead to degenerative
disorders, e.g.,
– Leber’s hereditary optic neuropathy (LHON)
– Kearns-Sayre syndrome

• Heteroplasmy leads to differences in the severity
and the kind of symptoms.

59
Homoplasmy
• All copies of mtDNA within the cells of an individual are
identical.

Heteroplasmy
• A mutation occurring in one copy of mtDNA can result
in both mutant and normal mtDNA within the same
cell.

• An individual may have some tissues enriched for
normal mtDNA and others enriched for mutant
mtDNA.
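The degree of heteroplasmy can be expressed as the fraction of mtDNA copies that carry the mutation. The following sketch is illustrative only: the tissue names and copy counts are hypothetical, and the functions `mutant_fraction` and `plasmy_state` are not from the source.

```python
def mutant_fraction(mutant_copies, normal_copies):
    """Fraction of mtDNA copies carrying the mutation (0.0 to 1.0)."""
    total = mutant_copies + normal_copies
    if total == 0:
        raise ValueError("no mtDNA copies counted")
    return mutant_copies / total


def plasmy_state(mutant_copies, normal_copies):
    """Homoplasmic if all copies are identical, otherwise heteroplasmic."""
    if mutant_copies == 0 or normal_copies == 0:
        return "homoplasmic"
    return "heteroplasmic"


# Hypothetical (mutant, normal) copy counts in two tissues of one individual.
tissues = {"muscle": (800, 200), "blood": (50, 950)}
for tissue, (mutant, normal) in tissues.items():
    level = mutant_fraction(mutant, normal)
    print(f"{tissue}: {plasmy_state(mutant, normal)}, {level:.0%} mutant mtDNA")
```

In this made-up example both tissues are heteroplasmic, but one is enriched for mutant mtDNA and the other for normal mtDNA, which is how the same mutation could produce different symptoms in different tissues.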
60
Intercompartmental DNA transfer

• A special form of lateral gene transfer.

• Associated with the gradual loss of an endosymbiont’s
independence on the path to becoming an organelle.

61
Known types of interorganelle transfer:
– Mitochondrion to nucleus
– Chloroplast to nucleus
– Chloroplast to mitochondrion
– Nucleus to mitochondrion
– Mitochondrion to chloroplast

62
• Eukaryotic genomes are mosaic: the product of a
complicated evolutionary history.

• Most human genes were transferred from an
endosymbiont:
– Genes of archaeal origin are involved in information
processing.
– Genes of bacterial origin are associated with metabolism
and cell structure.
– The proteins that make the nuclear envelope are encoded
by genes of both archaeal and bacterial origin.

63
Bacterial genome organization

• A single, covalently closed circular DNA molecule.

• Condensation involving histone-like proteins into a
structure called a nucleoid.
• Further condensation into supercoiled domains.

64
Histone-like or nucleoid-associated proteins
– HU (heat-unstable protein)
– IHF (integration host factor)
– H-NS (heat-stable nucleoid-structuring protein)
– SMC (structural maintenance of chromosomes)

65
• Lateral gene transfer provides a source of genetic
material for bacteria.

• This allows for their rapid response to changing
environments.
– e.g., in Japan, a human gut bacterium has acquired a gene
from a marine bacterium that encodes an enzyme involved
in digesting the seaweed used to wrap sushi.

66
Plasmid DNA

• Small, double-stranded circular or
linear DNA molecules.
• Carried by bacteria, some fungi,
and some higher plants.
• Extrachromosomal, independent,
and self-replicating.

67
Plasmids from bacteria
– Small, covalently closed circular DNA molecules.

– Carriers of resistance to antibiotics.

– Vehicles for genetic engineering.

68
Archaeal genome organization

• One double-stranded circular DNA molecule (0.5 to
5.5 Mb).

• Some archaea have two distinct histones, each with a
single histone fold domain.

• 60 bp of DNA wraps around a histone tetramer.

• Some archaea use non-histone packaging proteins.

69
Archaeal histone
Eukaryotic histone

70
• The evolutionary origins of histones can be traced
back to the archaeal histones.

• A “doublet histone” in some archaea may
represent an intermediate in the transition from
archaeal to eukaryotic histones.

71
Viral genome organization

Bacteriophages and mammalian DNA viruses

– Double-stranded linear, single-stranded circular, and
double-stranded circular genomes.

– Model systems for molecular biology.

– Provide a cloned set of genes on a single DNA molecule.

72
Prokaryotic viruses (phages)
Bacteriophages (bacterial viruses)

– Genome typically consists of a single DNA molecule,
largely devoid of associated proteins.

– Commonly used bacteriophages in molecular biology:
Bacteriophage λ (double-stranded linear genome)
M13 (single-stranded circular genome)

73
• Many recent advances in the study of bacteriophages
and viruses of archaea.

• Metagenomics: the sequencing of genomes of entire
populations within the “virosphere.”

• Isolation of many new virus-host systems of major
environmental importance.

• Many phage genes have no known functions or
homologs (“ORFans”).

• But viruses and cellular organisms also share a
common gene pool by lateral gene transfer.
74
Mammalian DNA viruses

• Infect mammalian cells and make use of the host
machinery for their replication.
• Genomes come in a diversity of forms:
– Human papilloma virus (circular, double-stranded)
– Simian virus 40 (circular, double-stranded)
– Adenovirus (linear, double-stranded)

75
• Little is known about how many mammalian DNA
viruses package their genome into the viral capsid.
• Some encode their own basic proteins.

• Simian virus 40 (SV40) uses host cell histones (H2A,
H2B, H3, and H4) (Fig. 5.23).

76

77
