
Gene Regulation

2021-2022 Pr. Bentayebi Kaoutar


Refresher + Background + Rationale

• Biochemistry of nucleic acids

• Characteristics of double-stranded DNA

• Why they are different from proteins (proteins refold and aggregate)

• How are the structures of nucleic acids suited to their function?

• Description of: (i) replication, (ii) transcription control, (iii) chromatin remodelling and (iv) translation
and post-translational modifications.

• How do we replicate the entire genome of organisms almost perfectly?

• What are the different types and roles of RNAs in a cell?

• How gene expression is regulated (epigenetics, chromatin tags and modifications, regulatory
sequences and proteins).
DNA:

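Figure: nucleotide structure. The phosphate forms a phosphoester bond with the pentose at the 5′ carbon.
Pentose carbons are numbered with primes (1′ to 5′); atoms of the base are numbered with regular numbers.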


This becomes incredibly important when we talk about putting together polymers
of DNA and the direction in which DNA is assembled in life.
Examples of common nucleotides:
DNA:
DNA:
Nucleotides:
DNA structure:
DNA structure:

The antiparallel orientation is very important for the thermodynamic stability and the optimum
hydrogen-bonding interaction of all those bases.
DNA structure:
Comparing DNA and RNA structure and function:
Double stranded DNA stability:
DNA replication:
How is DNA copied?:
How is DNA copied?:
Replication: Prokaryotic
Replication: Eukaryotic
DNA organisation:
Replication:
DNA replication - Telomeres
DNA Replication – Video 1

https://www.youtube.com/watch?reload=9&time_continue=8&v=OjPcT1uUZiE&feature=emb_logo
DNA Replication – Video 2
DNA replication, speed and accuracy
From DNA to protein:
Transcription
Controlling transcription
Transcription bubble:
Transcription: Central Dogma (video)

https://www.youtube.com/watch?v=DA2t5N72mgw
Transcription control and RNA processing
Types of RNA:
messenger RNA (mRNA):

• mRNA makes up about 5% of the total RNA in the cell.

• 5′ cap (7-methylguanosine triphosphate):
    • Recognition of mRNA in protein synthesis
    • Protection from 5′-exonuclease activity

• 3′ poly(A) tail:
    • Helps transport mRNA from the nucleus to the cytosol
    • Protection from 3′-exonuclease activity
messenger RNA (mRNA):

• UTRs play crucial roles in the post-transcriptional regulation of gene expression, including modulation of the transport of mRNAs out of the nucleus and of translation efficiency, subcellular localization and stability. UTRs may also play other roles, such as the specific incorporation of the modified amino acid selenocysteine at UGA codons of mRNAs encoding selenoproteins, in a process mediated by a conserved stem-loop structure in the 3′ UTR.

• The importance of UTRs in regulating gene expression is underlined by the finding that mutations that alter the UTR can lead to serious pathology.

• Nucleotide patterns or motifs located in 5′ UTRs and 3′ UTRs can interact with specific RNA-binding proteins.
https://pubs.rsc.org/en/content/articlehtml/2020/cs/d0cs00560f
Types of RNA:
Ribosomal RNA (rRNA):
Transfer RNA:

• 4 main arms, including the acceptor arm and the D arm

• Ribothymidine: a methyl group is added to uracil to form thymine

• Dihydrouridine: one of the double bonds of the uracil base is reduced by adding 2 hydrogen atoms
Graphical abstract: Translation

https://www.youtube.com/watch?v=8Hsz_Vmcy-Y&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=8
Types of RNA:
Regulatory non-coding RNA (ncRNA):

• For many years the gene regulatory networks (GRN) were thought to be controlled exclusively by protein coding
genes until the discovery of functional non-coding RNA transcripts (ncRNAs) which form an integrated network to
shape the cellular environment during different developmental and metabolic processes (Kim and Sung, 2012).

• These ncRNAs are divided into two categories based on the transcript length—small ncRNAs (<200 nucleotides)
and long ncRNAs (>200 nucleotides) (Mercer et al., 2009).

• Currently, miRNAs are the best-characterized ncRNAs that are well conserved and repress the expression of target
mRNA by binding to its 3′ UTR (Majoros and Ohler, 2007).

• On the other hand, long ncRNAs (lncRNAs) constitute a less characterized but highly diverse class of ncRNAs.
lncRNAs are structurally similar to protein-coding genes as most of them are transcribed by RNA polymerase II, 5′
capped and polyadenylated at 3′ end (Bunch et al., 2016).

• Despite their close similarity to protein-coding mRNAs, lncRNAs lack the potential to code for functional proteins, although many lncRNAs contain putative open reading frames and some have indeed been re-classified as protein-coding genes.

• Functionally, lncRNAs can act either in cis, regulating the expression of neighbouring genes, or in trans, regulating the expression of distant genes (Ulitsky and Bartel, 2013).

• lncRNAs regulate gene expression by modifying the three-dimensional organization of the genome, by mediating the binding of chromatin-modifying proteins, or by sequestering bound regulatory factors or miRNAs, acting as molecular decoys or sponges (Morriss and Cooper, 2017).
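
The length-based split into small and long ncRNAs can be written as a one-line rule. The sketch below is illustrative only: the function name is hypothetical, and because the text gives the cut-offs as <200 and >200 nucleotides, the handling of exactly 200 nucleotides is an assumption.

```python
def classify_ncrna(length_nt: int) -> str:
    """Classify a non-coding RNA as small or long by transcript length."""
    if length_nt <= 0:
        raise ValueError("transcript length must be positive")
    # Assumption: a transcript of exactly 200 nt is treated as long,
    # since the source only states <200 nt and >200 nt.
    return "small ncRNA" if length_nt < 200 else "long ncRNA (lncRNA)"


print(classify_ncrna(22))    # a length typical of a miRNA
print(classify_ncrna(2000))  # a length typical of a lncRNA
```

Length alone does not say anything about function; it is only the conventional first split used in the classification above.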
lncRNA mechanisms of action:

(A) Guide: lncRNAs activate or repress gene expression through relocalization of regulatory factors.
(B) Scaffold: lncRNAs aid in the formation of ribonucleoprotein (RNP) complexes.
(C) Decoy: lncRNAs remove the regulatory factor bound to the genome, thereby terminating its regulation.
(D) Sponge: lncRNAs sponge miRNAs, thus inhibiting miRNA-mediated gene repression.
(E) miRNA precursor: lncRNAs function as primary miRNA precursors that are processed into mature miRNAs.
(F) lncRNA transcription from regulatory regions of the genome initiates long-range gene regulation.

Sweta et al., 2019


lncRNA mechanisms of action (summary):

Salehi et al., 2017


miRNA mechanisms of action:

1. A protein called exportin-5 transports a hairpin precursor miRNA (pre-miRNA) out of the nucleus.

2. An enzyme called Dicer trims the pre-miRNA and removes the hairpin loop, leaving a double-stranded miRNA duplex.

3. One of the strands joins a group of proteins, forming an miRNA-protein complex called the RNA-induced silencing complex (RISC). The other strand, known as the passenger strand, is usually discarded.
miRNA mechanisms of action:

• In plant cells, the miRNA is usually perfectly complementary to its target mRNA molecule. The miRNA binds to it and causes the mRNA to break down.

• In animal cells, the miRNA nucleotides typically do not pair perfectly with the mRNA nucleotides; the base pairing often contains mismatches.

• The presence of the miRNA-protein complex blocks translation and speeds up deadenylation (breakdown of the poly(A) tail), which causes the mRNA to be degraded sooner and translated less.
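
The difference between perfect (plant-like) and imperfect (animal-like) pairing can be made concrete by counting mismatches between an miRNA and a target site. This is a minimal sketch under stated assumptions: the sequences are toy examples, the function names are hypothetical, only Watson-Crick pairs count as matches (G:U wobble pairs are counted as mismatches), and the miRNA and the site are assumed to have equal length.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antiparallel complementary RNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count non-Watson-Crick positions between an miRNA (5'->3') and an
    mRNA target site (5'->3') paired in antiparallel orientation."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have equal length")
    # Antiparallel pairing: miRNA position i faces the i-th base counted
    # from the 3' end of the target site.
    return sum(
        1 for m, t in zip(mirna, reversed(target_site)) if COMPLEMENT[m] != t
    )


mirna = "UGAGGUAGUAGG"                      # toy sequence, not a real miRNA
perfect_site = reverse_complement(mirna)    # fully complementary site
imperfect_site = perfect_site[:3] + "G" + perfect_site[4:]  # one base changed

print(count_mismatches(mirna, perfect_site))
print(count_mismatches(mirna, imperfect_site))
```

A site with zero mismatches corresponds to the plant-like case described above; one or more mismatches corresponds to the animal-like case.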
Small nucleolar RNA (snoRNA) and snoRNA mechanisms of action:

• Small RNAs of 60–300 nucleotides
• Predominantly found in the nucleolus
• Most snoRNAs function as guide RNAs for the post-transcriptional modification of ribosomal RNAs (synthesis of 2′-O-methylated nucleotides and pseudouridines in rRNA) and of some spliceosomal RNAs; a few others are involved in nucleolytic processing of the original rRNA transcript
• These post-transcriptional modifications are important for the production of efficient and accurate ribosomes.
• snoRNAs are commonly processed from introns of pre-mRNAs rather than from exons, demonstrating that introns can code for functional RNA.
The two major classes of snoRNAs are involved in two different types of post-transcriptional modification:

• C/D box snoRNAs define the target sites for 2′-O-ribose methylation, whereas H/ACA box snoRNAs define the target sites for pseudouridylation.
• C/D box snoRNAs and H/ACA box snoRNAs differ in their overall structure. The classical features of each class are directly correlated with the binding of specific proteins to form small nucleolar ribonucleoprotein (snoRNP) complexes, which modify the appropriate targets.
• In both cases, snoRNA guide sequences hybridize specifically to the relevant sequence in the rRNA, and the associated protein complexes then carry out the appropriate modification on the nucleotide identified by the snoRNA.

• Small Cajal body-specific RNAs (scaRNAs) are a class of snoRNAs that accumulate in small membraneless subcompartments of the nucleus (Cajal bodies) instead of the nucleolus. They are involved in the post-transcriptional modification of small nuclear RNAs (snRNAs).
• scaRNAs contain Cajal body localization signals, but are otherwise structurally similar to snoRNAs; a few contain both C/D and H/ACA boxes.
• Cajal bodies (CBs), or coiled bodies, are spherical nuclear bodies of 0.3–1.0 µm in diameter found in the nucleus of proliferative cells such as embryonic cells and tumor cells, or of metabolically active cells such as neurons. CBs are membrane-less organelles and consist largely of proteins and RNA.
Classes and genomic organisation of small nucleolar RNAs (snoRNAs):

(A) snoRNA classes and functions. Conserved consensus sequences (boxes) are shown as rectangles. 2′-O-Me: box C/D-dependent methylation; Ψ: box H/ACA-dependent pseudouridylation.
(B) Genomic organisation of snoRNA genes changes with increasing organism complexity.
(C) Alternative splicing regulates expression of snoRNA genes located within the same host molecule. A snoRNA that partially overlaps with exon 2 is not expressed when all exons are spliced. Exon 2 skipping generates an snoRNA-containing lariat, the processing of which releases all three snoRNAs.

Kufel & Grzechnik, 2018


Functions of scaRNA:

The functions of scaRNAs are diverse and can be largely categorized into modification of other RNAs or formation of derivatives that in turn perform various functions. Modification of RNAs generally occurs via two mechanisms: 2′-O-methylation by box C/D scaRNAs and pseudouridylation by box H/ACA scaRNAs. In the case of mRNA, however, modification occurs through the regulation of alternative splicing. The 3′ end of both box C/D and H/ACA RNAs can be processed into miRNAs. In the active telomerase enzyme, the telomerase RNA component (TERC) contains the 3′ hairpin loop of an H/ACA RNA within its structure.

Thuy et al., 2018
Small interfering RNA (siRNA) mechanisms of action:

• In 1998, around the time the Caenorhabditis elegans genome project was completed, Andrew Fire and Craig Mello described a new technology based on the silencing of specific genes by double-stranded RNA (dsRNA), which they called RNA interference (RNAi).

• Fire, Mello and colleagues showed that, in C. elegans, the presence of just a few molecules of dsRNA was sufficient to almost completely abolish the expression of a gene that was homologous to the dsRNA.

• Scientists started using RNAi not only to elucidate gene function, but also to develop antiviral therapeutics.

• The generation of a sequence-specific silencing agent was the first step in investigating the RNAi mechanism of action. A strong candidate for this agent was a special class of short RNAs originally reported by Andrew Hamilton and David Baulcombe. They found that Arabidopsis plants undergoing virus-induced gene silencing contained 21–25-nucleotide (nt) long RNAs that were complementary to both strands of the silenced gene and that had been processed from a long dsRNA precursor.

• The processing of long dsRNAs to 21–23-nt RNAs was recapitulated in vitro.

• The cloning and sequencing of these RNAs revealed that they had a very specific structure: 21–23-nt dsRNAs.

• The evidence that these short RNAs determine RNAi specificity came from studies in Drosophila, in which small RNAs isolated from cells undergoing silencing were shown to be sufficient to induce specific silencing in naive Drosophila embryo lysates and S2 cells. In addition, when synthetic 21- and 22-nt RNA duplexes were added to the lysate, they were able to guide efficient sequence-specific mRNA degradation. These small RNAs were named short interfering RNAs (siRNAs).
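
Because an siRNA guides sequence-specific degradation through complementarity to its target, candidate guide strands for a given mRNA can be enumerated as the antisense of every 21-nt window. The sketch below only illustrates that idea: the target sequence is made up, the function name is hypothetical, and none of the additional selection criteria used in practical siRNA design are applied.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def candidate_guides(mrna: str, length: int = 21) -> list:
    """Return (start, guide) pairs, one per window of the target mRNA.

    Each guide is the reverse complement (antisense strand, 5'->3') of the
    window that starts at the 0-based position `start`.
    """
    guides = []
    for start in range(len(mrna) - length + 1):
        window = mrna[start:start + length]
        guide = "".join(COMPLEMENT[base] for base in reversed(window))
        guides.append((start, guide))
    return guides


target = "AUGGCUAGCUAGGAUCCGAUUACGC"  # hypothetical 25-nt target fragment
for start, guide in candidate_guides(target):
    print(start, guide)
```

A 25-nt fragment yields five overlapping 21-nt windows, each with its own fully complementary guide.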
What is the difference between siRNA and miRNA?

Lam et al., 2015


Generation and action of siRNAs and miRNAs - Video

https://www.youtube.com/watch?v=5YsTW5i0Xro&feature=emb_imp_woyt
Short hairpin RNA or small hairpin (shRNA):

A short hairpin RNA or small hairpin RNA (shRNA/hairpin vector) is an artificial RNA molecule with a tight hairpin turn that can be used to silence target gene expression via RNA interference (RNAi). Expression of shRNA in cells is typically accomplished by delivery of plasmids or through viral or bacterial vectors.
Small nuclear RNA (snRNA):

• About 150 nucleotides long
• Involved in the post-transcriptional processing of pre-mRNA (splicing)
Classes of snRNA:

Uridine is a pyrimidine nucleoside.
Spliceosome:

• Nuclear pre-mRNA splicing is catalyzed by the spliceosome, a multi-ribonucleoprotein (RNP) complex.
• Two distinct spliceosomes coexist in most eukaryotes: the U2-dependent spliceosome, which catalyzes the removal of U2-type introns, and the less abundant U12-dependent spliceosome, which is present in only a subset of eukaryotes and splices the rare U12-type class of introns.
mRNA splicing:

https://www.youtube.com/watch?v=YgmoHtLGb5c&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=6
Spliceosome:

https://www.youtube.com/watch?v=OfeYFF85u-U
SUMMARY:

Introns are removed from the pre-mRNA by the spliceosome, a complex of five small nuclear ribonucleoproteins (snRNPs), named U1, U2, U4, U5, and U6, each composed of proteins and RNA. In addition, the ends of the mRNA are stabilized, often by the addition of a 7-methylguanosine cap at the 5′ end and polyadenylation (the addition of 150–200 adenosine nucleotides, a poly(A) tail) at the 3′ end. The mature mRNA is exported from the nucleus by nuclear RNA export factors (e.g., Nuclear RNA Export Factor 1), which bind to the poly(A) tail.
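
The processing steps in this summary can be sketched as string operations on a toy pre-mRNA. The sequence, the intron coordinates and the function names below are all hypothetical, and the 5′ cap is left out because it is a modified nucleotide rather than part of the encoded sequence.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns given as 0-based, half-open (start, end) intervals."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end < start or end > len(pre_mrna):
            raise ValueError("introns must be in range and non-overlapping")
        exons.append(pre_mrna[position:start])  # keep the exon before the intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the last exon
    return "".join(exons)


def polyadenylate(mrna: str, tail_length: int = 150) -> str:
    """Append a poly(A) tail; the text gives 150-200 adenosines."""
    return mrna + "A" * tail_length


# Toy transcript: exon, intron, exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUCCCAG" + "UUUGAA" + "GUAUGUAAAAG" + "UGA"
introns = [(6, 17), (23, 34)]  # hypothetical intron coordinates

spliced = splice(pre_mrna, introns)
mature = polyadenylate(spliced)
print(spliced)
print(len(mature))
```

In the cell the intron boundaries are recognized by the snRNPs of the spliceosome; here they are simply supplied as coordinates.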
piRNA:

PIWI-interacting RNA. PIWI proteins are regulatory proteins responsible for stem cell and germ cell differentiation.

• piRNAs are 21–35 nucleotides long

• Function:
    • Silence transposable elements (also called transposons or jumping genes)
    • Regulate gene expression
DNA cut-and-paste transposition

https://www.youtube.com/watch?v=XYZHMGUGq6o&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=15
Mechanism of silencing transposable elements:

A. Post-transcriptional gene silencing    B. Transcriptional gene silencing

Figure labels: methyltransferase; Caf1, which has 3′-exoribonuclease activity with a preference for poly(A) RNAs.
DNA is compacted into chromatin:
The nucleosome = DNA + Histone:
DNA is compacted into chromatin:

https://www.youtube.com/watch?v=4Z4KwuUfh0A&t=197s
DNA is compacted into chromatin:

https://www.youtube.com/watch?v=gbSIBhFwQ4s
DNA is compacted into chromatin: nucleosomes
DNA is compacted into chromatin: 30nm fiber

• Most chromosomal DNA chains within the interphase nucleus are believed to be held on a scaffold or backbone structure made from various proteins, with loops of between 20 and 200 kb.

• The scaffold, as well as permitting further compaction, serves to bring the DNA together in organised regions.

• DNA topoisomerases are among the many different protein components of these scaffolds.

• Loops can undergo decondensation, as a result of histone modification and/or topoisomerase action, when access is required by the cell (lower part of figure).
DNA is compacted into chromatin: metaphase chromosome
Schematic representations of chromosome structure during the cell cycle (TAD: topologically associating domain)

• In G1, TADs are insulated from one another and occupy distinct nuclear space and compartments.

• During S-phase, DNA is replicated at specific times, from early- to late-replicating domains, indicated by the proportion of replicated DNA in the TAD.

• Once in M-phase, the chromatin is highly compacted, TAD structure is barely identifiable, and abundant very-long-range contacts emerge between distant TADs (e.g. compare the relationship between the orange and turquoise TADs marked with stars in the zoom-out of G1-phase to the M-phase).

• (B) Comparison of the structure of mitotic chromosomes in yeast (for simplicity, only individual sister chromatids are shown). Saccharomyces cerevisiae chromosome arms are compacted by cohesin, whereas in Schizosaccharomyces pombe condensin is required. The rDNA locus of S. cerevisiae is brought into proximity of the centromere by condensin, which is not required at other loci.

Barrington et al., 2017
Further background on nuclear organisation

https://en.wikipedia.org/wiki/Nuclear_organization
Chromatin is classified into two groups: heterochromatin and euchromatin

• Heterochromatin is the part of the chromosomes that is tightly packed, genetically inactive, and found at the periphery of the nucleus.

• Euchromatin is a loosely packed (uncoiled) form of chromatin that is genetically active and found in the inner part of the nucleus.

• When the nucleus of a non-dividing cell is observed under the light microscope, it shows two kinds of regions that differ in the intensity of staining. The darkly stained regions are called heterochromatin and the lightly stained regions are called euchromatin.

Emil Heitz defined these two terms in 1928.
Constitutive heterochromatin
Example: X chromosome inactivation

• Epigenetic marks modulate euchromatin and heterochromatin function

• Heterochromatin has several functions:
    (i) Gene silencing;
    (ii) Structural integrity of the genome

https://www.youtube.com/watch?v=mHak9EZjySs&t=361s
Epigenetic modifications
CG methylation across species:
Lineage restriction of human developmental potency

• Specific chromatin patterns and epigenetic marks can be observed during human development, since they are responsible for controlling transcriptional activation and repression of tissue-specific and pluripotency-related genes, respectively.

• Global increases of heterochromatin marks and DNA methylation occur during differentiation.

https://stemcellres.biomedcentral.com/articles/10.1186/scrt83
Chemical modifications of histone:
Chemical modifications of the histones can take a number of different forms. The most important ones
are: Lysine acetylation; Serine, threonine, and tyrosine phosphorylation, and Lysine and arginine
methylation, which can take the form of mono-, di-, or trimethylation.
DNA methylation:

• DNA methylation occurs almost exclusively at cytosines that are followed by guanines. This is called CpG dinucleotide methylation. CpG is shorthand for 5′-C-phosphate-G-3′, that is, cytosine and guanine separated by only one phosphate group.

• A CpG island, sometimes also called an “HpaII tiny fragments” (HTF) island, is a short stretch of DNA, ~1000–2000 bp long, usually found in the upstream regions of transcriptionally active genes. It is characterized by CpG dinucleotides, relatively rare elsewhere in the genome, that occur unmethylated.

• Why does methylation occur almost exclusively at CpG dinucleotides?


DNA methylation:
• Symmetric methylation means that 5mC is present in the CG context on both antiparallel strands of DNA, so the methylation pattern can be faithfully reproduced during DNA replication.

• In the human genome, >80% of the cytosines in a CG context are methylated, which indicates a ubiquitous methylation landscape across the genome.

• Methylated cytosines have long been known to act as hotspots for mutation because of the high rate of spontaneous deamination of this base to thymine, which results in a G/T mismatch. If the mismatch is not repaired by the base excision repair (BER) pathway or by specific repair enzymes dedicated to this purpose, it is fixed as a C→T transition after replication.
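
How an unrepaired deamination becomes a fixed C→T transition can be traced on a four-base toy duplex. This is a sketch under stated assumptions: the sequence and function names are hypothetical, the bottom strand is written position-aligned with the top strand (so it reads 3′→5′), and repair is assumed not to occur.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner(strand: str) -> str:
    """Return the complementary strand, aligned position by position."""
    return "".join(PAIR[base] for base in strand)


def deaminate_5mc(strand: str, index: int) -> str:
    """Replace a (methylated) cytosine at `index` with thymine."""
    if strand[index] != "C":
        raise ValueError("position does not hold a cytosine")
    return strand[:index] + "T" + strand[index + 1:]


top = "ACGT"                      # the C at index 1 is assumed to be 5mC
bottom = partner(top)             # "TGCA", aligned under the top strand

damaged_top = deaminate_5mc(top, 1)
mismatches = [
    i for i, (a, b) in enumerate(zip(damaged_top, bottom)) if PAIR[a] != b
]

# Replication without repair: each parental strand templates a new partner.
daughter_from_top = (damaged_top, partner(damaged_top))
daughter_from_bottom = (partner(bottom), bottom)

print(mismatches)
print(daughter_from_top)
print(daughter_from_bottom)
```

One daughter duplex carries the transition (T:A where C:G used to be) while the other keeps the original sequence.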
DNA methylation:

• The deamination of cytosine leads to the production of uracil, which can base pair with adenine.

• Uracil is not normally found in DNA and so can easily be recognized and removed by repair systems in the cell. If uracil were a normal component of DNA, then recognizing the products of cytosine deamination would be more difficult, and a gradual loss of cytosines from the DNA would be expected over evolutionary time.
DNA methylation:
• Of the four standard nucleotides, cytosine has the highest rate of deamination, estimated at one in every 10⁷ cytosine residues per day.

• The danger of having a naturally occurring base generated by spontaneous deamination is illustrated by 5-methylcytosine (5mC).

• Methylation of cytosine is a common post-replicative modification of DNA in both bacteria and eukaryotes and occurs at position 5 (C5) of the pyrimidine ring.

• 5hmC is the first oxidative product in the active demethylation of 5-methylcytosine (5mC). The three ten-eleven translocation (TET) enzymes, TET1–3, catalyze each oxidation step: 5mC is first converted to 5-hydroxymethylcytosine (5hmC), then to 5-formylcytosine (5fC), then to 5-carboxylcytosine (5caC).
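
The deamination rate quoted above, read as one event per 10⁷ cytosine residues per day, turns into an expected daily count by simple division. In the sketch below the function name is hypothetical and the cytosine count is a round illustrative figure, not a measured genome value.

```python
CYTOSINES_PER_EVENT_PER_DAY = 10**7  # one deamination per 10^7 cytosines per day


def expected_deaminations_per_day(n_cytosines: int) -> float:
    """Expected number of cytosine deamination events per day."""
    return n_cytosines / CYTOSINES_PER_EVENT_PER_DAY


# Hypothetical genome containing 10^9 cytosines.
print(expected_deaminations_per_day(10**9))
```

With these assumed numbers the expectation is 100 events per day, which illustrates why dedicated repair of deamination products matters.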
DNA methylation:

• DNA methyltransferases (DNMTs), including DNMT1, DNMT3a and DNMT3b, catalyze the methylation of CpG dinucleotides by adding a methyl group from S-adenosyl-L-methionine to the C5 position of cytosine. DNMTs are responsible for setting and maintaining DNA methylation patterns.

• Transcriptional repressors bind preferentially to methylated CpGs, impeding access to gene regulatory regions.

• Methylated CpGs prevent the binding of specific transcription factors to gene regulatory regions, thereby inhibiting transactivation.
DNA methylation:

• CpG islands surround the promoters of housekeeping genes, which encode enzymes involved in essential metabolic pathways such as glycolysis.

• There is increasing experimental evidence that the methylation state of CpG islands affects gene expression.

• The methylation of cytosine plays a crucial role in the regulation of chromatin stability. For example, recruitment of histone deacetylases (HDACs) by methyl-CpG-binding protein 2 (MeCP2) and methyl-CpG-binding domain protein 2 (MBD2) alters the chromatin’s conformation and makes promoter regions less accessible to transcriptional activators.
Example:

Methyl-CpG-binding domain protein 1 (MBD1) is closely linked to DNA methylation. Its MBD domain recognizes and binds to methylated CpGs. This binding allows it to trigger methylation of H3K9 and results in transcriptional repression. The CXXC3 domain of MBD1 makes it a unique member of the MBD family because of its affinity for unmethylated DNA. MBD1 acts as an epigenetic regulator via different mechanisms, such as the formation of the MCAF1/MBD1/SETDB1 complex or the MBD1-HDAC3 complex. Because methylation status changes during carcinogenesis and neurogenesis, MBD1 and its interacting partners, including proteins and non-coding RNAs, participate in normal and pathological processes and function in different regulatory systems. Because of the important role of MBD1 in epigenetic regulation, it is a good candidate as a therapeutic target for diseases.

https://www.researchgate.net/figure/Structure-of-methyl-CpG-binding-domain-protein1-MBD1-and-the-basic-mechanism-of-MBD1_fig1_273786711
Oncogenesis
Histone acetylation:

• Histone acetylation involves the covalent addition of an acetyl group to lysine. Because of its –NH2 group, lysine is normally a positively charged amino acid, which binds strongly to the negatively charged DNA molecule.

• The addition of the acetyl group neutralizes this positive charge and hence reduces the binding between histones and DNA, leading to a more open structure which is more accessible to the transcriptional machinery.

• Histone acetylation therefore leads to transcriptional activation.
Histone acetylation:

• Histone acetylation is mediated by histone acetyltransferase (HAT) enzymes, which are divided into three subfamilies: GNAT, MYST, and p300/CBP. The removal of the acetyl group, on the other hand, is mediated by histone deacetylase (HDAC) enzymes.

• In mammals, four different HDAC families are known: the zinc-dependent classes I, II, and IV and the NAD-dependent class III (also known as the sirtuin family).

• As an example of the interaction between the different epigenetic mechanisms, the above-mentioned MeCP2, when bound to 5mC, can recruit HDACs to actively deacetylate lysine residues, which restores the original strong binding of positively charged lysine to DNA and reduces transcriptional activity. In addition to this interaction between histone acetylation and DNA methylation, interactions have been found between histone acetylation and histone phosphorylation and histone methylation.
Table 1. Structural and functional classification of genomic DNA sequence features (a)

1. Structural and functional features of genes
    1. 5′ and 3′ flanking
    2. Promoter
    3. Exonic
        1. Untranslated regions (UTRs)
            1. 5′ UTR
            2. 3′ UTR
        2. Exonic protein‐coding
    4. Splicing
        1. 5′ and 3′ splice sites
        2. Splicing enhancer and splicing silencer
        3. Other regulatory
    5. Transcription control elements
        1. Enhancer
        2. Repressor
        3. Coregulator/modifier
        4. Attenuator
        5. Insulator
Table 1. Structural and functional classification of genomic DNA sequence features (b)

2. Structural and functional features of chromosomal DNA
    1. Matrix/scaffold attachment region
    2. Origin of replication
        1. Replicator
    3. Centromere
    4. Telomere
    5. CpG island
    6. Nucleosome phasing elements
    7. Pseudogene/gene fragment
    8. Repetitive elements
        1. Unique or low‐copy number repetitive element
        2. Moderate to highly abundant repetitive element
        3. Simple repeat expansions (e.g. single base, doublet, triplet)
        4. Tandemly repeated or clustered repeats
        5. Satellite DNA
            1. Minisatellite
            2. Microsatellite
            3. Megasatellite
    9. Interspersed repetitive elements
Table 1. Structural and functional classification of genomic DNA sequence features (c)

3. Retroposons
1. SINEs (short interspersed elements)
1. Alu
2. MIR (mammalian interspersed repeats)
2. LINEs (long interspersed elements): LINE1, LINE2
1. RLEs (retrovirus‐like elements)
2. HERVs (human endogenous retroviruses)
3. MaLRs (mammalian apparent LTR‐retrotransposons)
4. Others
3. DNA transposons: mariner, others
2. Unclassified elements
https://www.youtube.com/watch?v=7Hk9jct2ozY
REMEMBER:

• Methylation at the C5 position of the cytosine in a CpG dinucleotide is an epigenetic modification associated with repressive chromatin structures.

• CpG islands, regions characterized by a high content of CpG dinucleotides, are mostly hypomethylated (unmethylated) in normal somatic cells.

• CpG islands in the promoters of tumor suppressors are mostly hypermethylated in cancer cells.

• Modifications of DNA that alter its physical properties will likely affect the dynamics of nucleosome assembly.

• CpG methylation induces nucleosome compaction and a change in DNA topology, likely because methylation makes the DNA more rigid.
Remember:
the International Union of Pure and Applied Chemistry (IUPAC) nucleotide alphabet
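
The IUPAC ambiguity codes map naturally onto regular-expression character classes, which makes degenerate motifs searchable. The sketch below uses the standard IUPAC DNA codes; the function name, the example motif TATAWAW and the test sequences are illustrative assumptions.

```python
import re

# Standard IUPAC nucleotide codes for DNA and the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular-expression pattern."""
    return "".join(IUPAC[code] for code in motif.upper())


pattern = re.compile(iupac_to_regex("TATAWAW"))  # W = A or T

print(pattern.pattern)
print(bool(pattern.search("GGTATAAATGG")))  # contains TATAAAT
print(bool(pattern.search("GGTATACAGG")))   # C at a W position
```

The first sequence matches because both W positions hold A or T; the second does not because a C occupies a W position.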
Transcription factors:

• Transcription factors are key molecules that regulate the use of genetic information in the genome.

• Transcription factors are involved in many fundamental aspects of biology, including embryonic development, cellular differentiation and cell fate.
Transcription factors:

A transcription factor is a protein that binds to a specific DNA sequence and controls how that segment of DNA is read.

Remember that all these stretches are nucleotide sequences, and they can mutate.
• Promoter: facilitates gene transcription when bound to a transcription factor
• Enhancer: enhances the transcription level/rate when bound to a transcription-factor-like protein
• Repressor: represses the transcription level/rate when bound to a transcription-factor-like protein

Mutations in any of these regions can increase the production of a gene product, decrease it, or shut it off (as compared to the unmutated version).
Enhancer:

• Enhancer sequences are regulatory DNA sequences that, when bound by specific proteins called
transcription factors, enhance the transcription of an associated gene.

• Transcription factors can bind to enhancer sequences located upstream or downstream from an
associated gene.
Enhancer:
Protein cofactors empower TD for high-affinity DNA binding

• General transcription factors (gTF) are involved in the formation of the pre-initiation complex during transcription.

• Specific transcription factors (sTF) stimulate or repress transcription by recruiting intermediary proteins such
as cofactors (CoF) that allow efficient recruitment of the preinitiation complex and RNA polymerase.
Enhancer:

Transcription factors can bind to one or several enhancer sequences to activate the transcription of a
specific gene.
Enhancer:

Enhancers can be active (accessible to transcription factors), primed (accessible to some but not all of the relevant transcription factors), or coiled (poised; not accessible to transcription factors).
Enhancer:

Enhancer activity differs between cells. For example, in a nerve cell, only enhancer A is needed to activate the transcription of gene X, whereas in an intestinal cell, both enhancers A and B are needed to activate the transcription of the same gene X.
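
The cell-type dependence in this example is a set-inclusion rule: gene X is transcribed when every enhancer required in that cell type is active. The sketch below encodes only the example from the slide; the names are hypothetical and real enhancer logic is not simply Boolean.

```python
# Enhancers required to activate gene X in each cell type (from the example).
REQUIRED_ENHANCERS = {
    "nerve cell": {"A"},
    "intestinal cell": {"A", "B"},
}


def gene_x_transcribed(cell_type: str, active_enhancers: set) -> bool:
    """Return True if all enhancers required in this cell type are active."""
    return REQUIRED_ENHANCERS[cell_type] <= set(active_enhancers)


print(gene_x_transcribed("nerve cell", {"A"}))
print(gene_x_transcribed("intestinal cell", {"A"}))
print(gene_x_transcribed("intestinal cell", {"A", "B"}))
```

Enhancer A alone is enough in the nerve cell but not in the intestinal cell, which also needs enhancer B.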
Transcription factors:

As a zygote develops and undergoes its orderly cleavages, the DNA in each new cell is repackaged and modified:

• some genes turn on (are expressed)

• some genes turn off (are not expressed)

• some genes code for proteins that turn other genes on or off

• these proteins are called transcription factors because they affect the cell's ability to transcribe DNA into RNA (transcription)

• such transcriptional control can be either positive or negative

As the genetic instructions guiding a vertebrate embryo's development change in each new cell, the cells themselves follow those instructions and are modified:

• some change form and function to become specific types of cells


Transcription factors:

•later, specialized cells clump together to form organs,
•many cells migrate from one location to another, and contribute to tissues and organs far
from where the cell first appeared,
•still other cells die via programmed death (apoptosis) to facilitate proper formation of
structures
All of this is governed by instructions on the DNA. Each cell has the same genome, but
different genes are active and inactive in each type of cell.
Transcription factors:
How do the right genes get expressed in the right cells and at the right time?

[Figure: RNA polymerase and transcription factors on the DNA. Labels: regulatory region,
promoter, gene to be transcribed; sequence motifs shown: TTTCAC, GCCGCC, TATAA]
• Transcription factors recruit and instruct RNA pol II to initiate RNA synthesis at specific genes by recognising
and binding to DNA elements called promoters.

• Conserved consensus motifs have been predicted for transcription factor binding across the human genome,
and empirical transcription factor binding sites (TFBS) have been determined biologically using ChIP-seq, a
genome-wide technique which couples chromatin immunoprecipitation with high-throughput sequencing.
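
As a rough illustration of what predicting a consensus motif means computationally, the sketch below searches a DNA string for exact copies of a motif. It is a simplification: real binding sites tolerate mismatches, and only one strand is scanned here. The function `find_motif` and the toy sequence are invented for this example; the three motifs are the ones labelled in the figure above.

```python
def find_motif(sequence, motif):
    """Return the 0-based start of every exact match (overlaps allowed)."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


# Toy sequence made up for illustration, not a real promoter.
toy = "GGTTTCACAAGCCGCCTTTATAAGGATGC"
for motif in ("TTTCAC", "GCCGCC", "TATAA"):
    print(motif, find_motif(toy, motif))
```

Predicted matches of this kind are only candidates; ChIP-seq is what shows whether a factor actually occupies the site in a given cell.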
Transcription factors:

A binding site in DNA indicates where certain proteins bind. Binding sites
are found at specific locations in the genome.

Molecules such as proteins bind to these sites in order to regulate
transcription. These proteins are called transcription factors (TFs), and
their binding sites are accordingly named transcription factor binding
sites (TFBS).
Transcription factors:

[Figure: Transcription factor binding sites relative to the transcription start site.]

The black line represents the DNA.

Different types of binding sites are represented by the orange boxes.

Binding occurs upstream of transcription initiation.

Multiple transcription factor binding sites can compose cis-regulatory modules.
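
Positions of binding sites are usually reported relative to the transcription start site (TSS): the first transcribed nucleotide is +1, upstream positions are negative, and there is no position 0. A minimal sketch of that conversion, with a hypothetical function name and made-up indices:

```python
def relative_to_tss(index, tss_index):
    """Convert a 0-based sequence index to a TSS-relative position.

    The TSS itself is +1, the base just upstream is -1 (no position 0).
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


tss = 100  # hypothetical 0-based index of the TSS in some sequence
for site_start in (70, 99, 100, 105):
    print(site_start, relative_to_tss(site_start, tss))
```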
Transcription factors: GTF
• Transcription factor TFIIA is a nuclear protein involved
in the RNA polymerase II-dependent transcription
of DNA. TFIIA is one of several general
(basal) transcription factors (GTFs) that are required
for all transcription events that use RNA polymerase
II. Other GTFs include TFIID, a complex composed of
the TATA binding protein (TBP) and TBP-associated
factors (TAFs), as well as the factors TFIIB, TFIIE, TFIIF,
and TFIIH. Together, these factors are responsible
for promoter recognition and the formation of a
transcription preinitiation complex (PIC) capable of
initiating RNA synthesis from a DNA template.

• TFIIA interacts with the TBP subunit of TFIID and aids
in the binding of TBP to TATA-box-containing promoter DNA.

• TFIIH is a general transcription factor that acts to
recruit RNA Pol II to the promoters of genes. It
functions as a helicase that unwinds DNA.
Transcription factors: TF II D
•Transcription factor II D (TFIID) is
one of several general transcription
factors that make up the RNA
polymerase II preinitiation complex.

•Coordinates the activities of
more than 70 polypeptides
required for initiation of
transcription by RNA polymerase II

•Binds to the core promoter to
position the polymerase properly

•Serves as the scaffold for
assembly of the remainder of the
transcription complex

•Acts as a channel for regulatory
signals
Transcription factors:

https://www.youtube.com/watch?v=XzVXhemtwmA
Transcription regulation:

https://www.youtube.com/watch?v=Gs3llepaaB0
Transcription factors specificity:

• Several studies have indicated that the primary DNA binding specificity of transcription factors (TFs)
evolves slowly, and is extremely conserved between mammalian species.

• The origin of most structural families of TFs dates well before the emergence of mammals,
and even predates the divergence of vertebrates and invertebrates. Within each TF family, DNA
binding specificity has also diverged considerably, with many families having 2–10 different
subclasses displaying different primary binding specificities.

• Databases collecting TF binding specificity information, such as TRANSFAC (Matys et al., 2006)
and JASPAR (Bryne et al., 2008), contain a large number of specificities from different species.
However, the data are generally derived using different methods in different laboratories, and
therefore it is very difficult to separate method-specific biases from real differences in binding
specificity, particularly in cases where the differences are not very pronounced.
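
Binding specificities of this kind are commonly summarised as a position weight matrix (PWM). The sketch below shows one common way to turn per-position base counts into log-odds scores and use them to score candidate sites. The counts are invented for illustration and are not taken from TRANSFAC or JASPAR; the pseudocount and the uniform background of 0.25 are assumptions, and all function names are hypothetical.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from a list of per-position base counts."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def score(matrix, site):
    """Sum the per-position scores of a site as wide as the matrix."""
    return sum(col[base] for col, base in zip(matrix, site.upper()))


def best_site(matrix, sequence):
    """Return the highest-scoring window (assumes sequence >= matrix width)."""
    width = len(matrix)
    windows = [sequence[i:i + width] for i in range(len(sequence) - width + 1)]
    return max(windows, key=lambda w: score(matrix, w))


# Made-up counts from 10 hypothetical aligned sites resembling TATAA.
counts = [
    {"T": 9, "A": 1},
    {"A": 10},
    {"T": 8, "A": 2},
    {"A": 10},
    {"A": 7, "T": 3},
]
pwm = log_odds_matrix(counts)
print(best_site(pwm, "GGCTATAAGG"))
```

Because the matrix depends on which sites were collected and how, two laboratories using different methods can obtain different matrices for the same factor, which is the difficulty described above.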
