• Why they are different from proteins (proteins refold and aggregate)
• How the structures of the nucleic acids are suited to their function
• Description of: (i) replication, (ii) transcription control, (iii) chromatin remodelling and (iv) translation
and post-translational modifications.
• How gene expression is regulated (epigenetics, chromatin tags and modifications, regulatory
sequences and proteins).
DNA:
[Figure: nucleotide structure. Labels: forms phosphoester bond with pentose; pentose carbons numbered 1’ to 5’; regular numbers.]
https://www.youtube.com/watch?reload=9&time_continue=8&v=OjPcT1uUZiE&feature=emb_logo
DNA Replication – Video 2
DNA replication, speed and accuracy
From DNA to protein:
Transcription
Controlling transcription
Transcription bubble:
Transcription: Central Dogma (video)
https://www.youtube.com/watch?v=DA2t5N72mgw
Transcription control and RNA processing
Types of RNA:
messenger RNA (mRNA):
• 5% of total RNA in the cell
• 5′ cap: 7-methylguanosine triphosphate
[Figure labels: D arm, ribothymidine, dihydrouridine]
https://www.youtube.com/watch?v=8Hsz_Vmcy-Y&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=8
Types of RNA:
Regulatory non coding RNA (ncRNA):
• For many years the gene regulatory networks (GRN) were thought to be controlled exclusively by protein coding
genes until the discovery of functional non-coding RNA transcripts (ncRNAs) which form an integrated network to
shape the cellular environment during different developmental and metabolic processes (Kim and Sung, 2012).
• These ncRNAs are divided into two categories based on the transcript length—small ncRNAs (<200 nucleotides)
and long ncRNAs (>200 nucleotides) (Mercer et al., 2009).
• Currently, miRNAs are the best-characterized ncRNAs that are well conserved and repress the expression of target
mRNA by binding to its 3′ UTR (Majoros and Ohler, 2007).
• On the other hand, long ncRNAs (lncRNAs) constitute a less characterized but highly diverse class of ncRNAs.
lncRNAs are structurally similar to protein-coding genes as most of them are transcribed by RNA polymerase II, 5′
capped and polyadenylated at 3′ end (Bunch et al., 2016).
• Despite their close similarity to protein-coding mRNAs, lncRNAs lack the potential to code for functional
proteins, although many lncRNAs contain putative open reading frames and some have indeed been
re-classified as protein-coding genes.
• Functionally, lncRNAs can either act in cis by regulating expression of neighbouring genes, or in trans, regulating
the expression of distant genes (Ulitsky and Bartel, 2013).
• lncRNAs regulate gene expression by modulating the three-dimensional genome organization, by mediating the
binding of chromatin-modifying proteins, or by sequestering regulatory factors or miRNAs, acting as
molecular decoys or sponges (Morriss and Cooper, 2017).
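The length-based split between small and long ncRNAs described above can be sketched in a few lines. This is only an illustration: the function name and the example sequences are hypothetical, and because the text gives "<200" and ">200" without saying where a transcript of exactly 200 nucleotides belongs, the sketch assumes it counts as long.

```python
# Length cutoff (in nucleotides) separating small from long ncRNAs, as stated in the notes.
LENGTH_CUTOFF_NT = 200


def classify_ncrna(sequence: str) -> str:
    """Classify a non-coding transcript by length alone.

    Assumption: a transcript of exactly 200 nt is treated as long,
    because the notes do not cover that boundary case.
    """
    if len(sequence) < LENGTH_CUTOFF_NT:
        return "small ncRNA"
    return "long ncRNA"


# Hypothetical sequences, used only to exercise the function.
mirna_like = "U" * 22        # miRNA-sized transcript
lncrna_like = "A" * 1500     # lncRNA-sized transcript

print(classify_ncrna(mirna_like))
print(classify_ncrna(lncrna_like))
```

Length is only the first classification criterion; it says nothing about coding potential or mechanism of action.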
lncRNA mechanisms of action:
• C/D box snoRNAs define the target sites for 2′-O-ribose methylation, whereas H/ACA box snoRNAs
define the target sites for pseudouridylation.
• C/D box snoRNAs and H/ACA box snoRNAs differ in their overall structure, with the classical features of
each directly correlated with the binding of specific proteins to form small nucleolar ribonucleoprotein
(snoRNP) complexes, which modify the appropriate targets.
• In both cases, snoRNA guide sequences hybridize specifically to the relevant sequence in the rRNA, and
the associated protein complexes then carry out the appropriate modification on the nucleotide that is
identified by the snoRNAs.
• Small Cajal body-specific RNAs (scaRNAs) are a class of snoRNA that accumulates in small membraneless
subcompartments in the nucleus (Cajal bodies) instead of the nucleolus. They are involved in the
post-transcriptional modification of small nuclear RNAs (snRNAs).
• scaRNAs contain Cajal body localization signals, but otherwise these RNAs are structurally similar to
snoRNAs, and a few contain both C/D and H/ACA boxes.
• Cajal bodies (CBs) or coiled bodies, are spherical nuclear bodies of 0.3–1.0 µm in diameter found in
the nucleus of proliferative cells like embryonic cells and tumor cells, or metabolically active cells
like neurons. CBs are membrane-less organelles and largely consist of proteins and RNA.
Classes and Genomic Organisation of Small Nucleolar RNAs (snoRNAs):
• In 1998, at the time of the completion of the Caenorhabditis elegans genome project, Andrew Fire and Craig Mello
described a new technology based on the silencing of specific genes by double-stranded RNA (dsRNA), a
technology they called RNA interference (RNAi).
• Fire, Mello and colleagues showed that, in C. elegans, the presence of just a few molecules of dsRNA was sufficient to
almost completely abolish the expression of a gene that was homologous to the dsRNA.
• Scientists started using RNAi not only to elucidate gene function, but also to develop antiviral therapeutics.
• Identifying the sequence-specific silencing agent was the first step in investigating the RNAi mechanism of action. A strong
candidate for this agent was a special class of short RNAs originally reported by Andrew Hamilton and David
Baulcombe. They found that Arabidopsis plants undergoing virus-induced gene silencing contained 21–25-nucleotide (nt)
long RNAs that were complementary to both strands of the silenced gene and that had been processed from a long dsRNA
precursor.
• The cloning and sequencing of these RNAs revealed that they had a very specific structure: 21–23-nt dsRNAs.
• The evidence that these short RNAs determined RNAi specificity came from studies in Drosophila, in which small RNAs that
were isolated from cells undergoing silencing were shown to be sufficient to induce specific silencing in
naive Drosophila embryo lysates and S2 cells. In addition, when synthetic 21- and 22-nt RNA duplexes were added to the
lysate they were able to guide efficient sequence-specific mRNA degradation. These small RNAs were named short
interfering RNAs (siRNAs).
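The sequence specificity described above comes from base pairing between the siRNA and its target mRNA. The following is a minimal sketch of that idea, assuming perfect complementarity and a guide length of 21 to 23 nt as in the notes; the function names and the sequences are hypothetical, and real siRNA target recognition is more permissive than an exact string match.

```python
from typing import List

# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


def find_sirna_targets(guide: str, mrna: str) -> List[int]:
    """Return 0-based start positions in mrna that pair perfectly with guide.

    Assumption: only perfect complementarity over the whole guide counts.
    """
    if not 21 <= len(guide) <= 23:
        raise ValueError("expected a 21-23 nt guide strand")
    target_site = reverse_complement_rna(guide)
    positions = []
    start = mrna.find(target_site)
    while start != -1:
        positions.append(start)
        start = mrna.find(target_site, start + 1)
    return positions


# Hypothetical 21 nt target site embedded in a hypothetical mRNA.
site = "GCUAGCUAGGAUCCGAUACGU"
mrna = "AUGGC" + site + "UUAA"
guide = reverse_complement_rna(site)  # guide strand complementary to the site

print(find_sirna_targets(guide, mrna))
```

An empty list means the guide has no perfectly complementary site in that mRNA.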
What is the difference between siRNA and miRNA?
https://www.youtube.com/watch?v=5YsTW5i0Xro&feature=emb_imp_woyt
Short hairpin RNA or small hairpin (shRNA):
• 150 nucleotides
• Post-transcriptional modifications of mRNA
Classes of snRNA:
Uridine is a pyrimidine nucleoside
Spliceosome:
• Nuclear pre-mRNA splicing is catalyzed by the spliceosome, a multi-ribonucleoprotein (RNP) complex.
• Two unique spliceosomes coexist in most eukaryotes: the U2-dependent spliceosome, which catalyzes the removal
of U2-type introns, and the less abundant U12-dependent spliceosome, which is present in only a subset of
eukaryotes and splices the rare U12-type class of introns.
mRNA splicing:
https://www.youtube.com/watch?v=YgmoHtLGb5c&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=6
Spliceosome:
https://www.youtube.com/watch?v=OfeYFF85u-U
SUMMARY:
PIWI-interacting RNA (piRNA): PIWI proteins are regulatory proteins responsible for stem cell and
germ cell differentiation
• Function:
https://www.youtube.com/watch?v=XYZHMGUGq6o&list=PL3MAPgqN8JWib86aCRPB6hPcIMvJqR975&index=15
Mechanism of silencing transposable elements:
Methyltransferase
https://www.youtube.com/watch?v=4Z4KwuUfh0A&t=197s
DNA is compacted into chromatin:
https://www.youtube.com/watch?v=gbSIBhFwQ4s
DNA is compacted into chromatin: nucleosomes
DNA is compacted into chromatin: 30nm fiber
https://en.wikipedia.org/wiki/Nuclear_organization
Chromatin is classified into two groups: heterochromatin and euchromatin
• Epigenetic marks modulate euchromatin and heterochromatin function
• Heterochromatin has several functions:
(i) Gene silencing;
(ii) Structural integrity of the genome
https://www.youtube.com/watch?v=mHak9EZjySs&t=361s
Epigenetic modifications
CG methylation across species:
Lineage restriction of human developmental potency
• Global increases of heterochromatin marks and DNA methylation occur during differentiation.
https://stemcellres.biomedcentral.com/articles/10.1186/scrt83
Chemical modifications of histone:
Chemical modifications of the histones can take a number of different forms. The most important ones
are lysine acetylation; serine, threonine, and tyrosine phosphorylation; and lysine and arginine
methylation, which can take the form of mono-, di-, or trimethylation.
DNA methylation:
• DNA methylation occurs almost exclusively at cytosines that are followed by guanines. This is
called CpG dinucleotide methylation. CpG is shorthand for 5'-C-phosphate-G-3',
that is, cytosine and guanine separated by only one phosphate group.
• CpG islands are sometimes also called “HpaII tiny fragments” (HTF) islands. They are short DNA
fragments, ~1000-2000 bp long, usually found in the upstream sequence regions of
transcriptionally active genes, and are characterized by the otherwise relatively rare
CpG dinucleotide occurring unmethylated.
• In the human genome, >80% of the cytosines present in a CG context are methylated, which
indicates a ubiquitous methylation landscape across the genome.
• Modified cytosines have long been known to act as hotspots for mutations due to the
high rate of spontaneous deamination of this base to thymine, resulting in a G/T
mismatch. This will be fixed as a C→T transition after replication if not repaired by the
base excision repair (BER) pathway or specific repair enzymes dedicated to this purpose.
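The CpG definition and the deamination route to a C→T transition described above can be traced on a toy sequence. This is a sketch under stated assumptions: the six-base sequence is hypothetical, every CpG cytosine is treated as methylated, and the function names are illustrative only.

```python
# Watson-Crick pairing for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cpg_positions(seq: str) -> list:
    """Return 0-based indices of the C in every 5'-CG-3' dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def deaminate(seq: str, position: int) -> str:
    """Model deamination of a methylated cytosine to thymine at one position."""
    if seq[position] != "C":
        raise ValueError("deamination is modelled only at cytosines")
    return seq[:position] + "T" + seq[position + 1:]


def complement_strand(seq: str) -> str:
    """Return the paired strand, written base-by-base under the input strand."""
    return seq.translate(DNA_COMPLEMENT)


top = "ATCGGA"                            # hypothetical sequence with one CpG
bottom = complement_strand(top)           # original partner strand
site = cpg_positions(top)[0]

damaged_top = deaminate(top, site)        # T now sits opposite the original G
print(damaged_top[site], bottom[site])    # the G/T mismatch

# If the mismatch is not repaired, replicating the damaged strand pairs the
# new T with A, which fixes the C->T transition in that daughter duplex.
print(damaged_top, complement_strand(damaged_top))
```

Repair by BER would instead restore the C using the intact G on the partner strand as the template.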
DNA methylation:
• In mammals, four different HDAC families are known: the zinc dependent
classes I, II, and IV and the NAD-dependent class III (which is also known
as the sirtuin family).
https://www.youtube.com/watch?v=7Hk9jct2ozY
REMEMBER:
• CpG islands, regions characterized by a high content of CpG dinucleotides, are mostly
unmethylated (hypomethylated) in normal somatic cells
• CpG islands in the promoters of tumor suppressor genes are frequently
hypermethylated in cancer cells
• Modifications on DNA that alter its physical properties will likely affect the dynamics of
nucleosome assembly.
• CpG methylation induces nucleosome compaction and a change in DNA topology, likely because
methylation makes the DNA more rigid
Table 1. Structural and functional classification of genomic DNA sequence features (a)
3. Retroposons
1.SINEs (short interspersed elements)
1.Alu
2.MIR (mammalian interspersed repeats)
2.LINEs (long interspersed elements): LINE1, LINE2
1.RLEs (retrovirus‐like elements)
2.HERVs (human endogenous retroviruses)
3.MaLRs (mammalian apparent LTR‐retrotransposons)
4.Others
3.DNA transposons: mariner, others
2.Unclassified elements
Remember: the International Union of Pure and Applied Chemistry (IUPAC) nucleotide alphabet
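As a reminder of how the IUPAC alphabet is used in practice, the sketch below expands the ambiguity codes into the bases they stand for and searches a DNA sequence for a degenerate motif. The code table is the standard IUPAC one; the function names, the motif and the sequence are hypothetical examples.

```python
import re

# IUPAC nucleotide codes and the DNA bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular expression string."""
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_motif(motif: str, seq: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets matches overlap, since it consumes no characters.
    pattern = "(?=" + iupac_to_regex(motif) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


print(iupac_to_regex("TATAW"))
print(find_motif("TATAW", "GGTATAAGCTATATC"))
```

Only the given strand is searched here; checking the reverse complement as well would be the natural next step for double-stranded DNA.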
Transcription factors:
• Enhancer sequences are regulatory DNA sequences that, when bound by specific proteins called
transcription factors, enhance the transcription of an associated gene.
• Transcription factors can bind to enhancer sequences located upstream or downstream from an
associated gene.
Enhancer: protein cofactors empower TD for high-affinity DNA binding
• General transcription factors (gTF) are involved in the formation of the pre-initiation complex during transcription.
• Specific transcription factors (sTF) stimulate or repress transcription by recruiting intermediary proteins such
as cofactors (CoF) that allow efficient recruitment of the preinitiation complex and RNA polymerase.
Enhancer:
Transcription factors can bind to one or several enhancer sequences to activate the transcription of a
specific gene.
Enhancer:
Enhancer activity differs between cells. For example, in a nerve cell, only enhancer A is needed to activate the
transcription of gene X, whereas in an intestinal cell, both enhancers A and B are needed to activate the
transcription of the same gene X.
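The nerve cell versus intestinal cell example can be read as a simple logical rule: gene X is transcribed only when every enhancer required in that cell type is bound. The sketch below encodes exactly that rule and nothing more; the dictionary and function names are hypothetical, and real enhancer activity is quantitative rather than all-or-nothing.

```python
# Enhancers that must all be bound for gene X to be transcribed,
# per cell type, taken from the example in the notes.
REQUIRED_ENHANCERS = {
    "nerve": {"A"},
    "intestinal": {"A", "B"},
}


def gene_x_is_transcribed(cell_type: str, bound_enhancers) -> bool:
    """Return True if every enhancer required in this cell type is bound."""
    required = REQUIRED_ENHANCERS[cell_type]
    return required <= set(bound_enhancers)  # subset test


print(gene_x_is_transcribed("nerve", {"A"}))
print(gene_x_is_transcribed("intestinal", {"A"}))
print(gene_x_is_transcribed("intestinal", {"A", "B"}))
```

The same bound enhancer A therefore gives different outcomes depending on the cell type it is in.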
Transcription factors:
• As a zygote develops and undergoes its orderly cleavages, the DNA in each new cell is repackaged and
modified: some genes turn on (are expressed).
• Some genes code for proteins that turn other genes on or off.
• These proteins are called transcription factors because they affect the cell's ability to transcribe DNA
into RNA (transcription).
As the genetic instructions guiding a vertebrate embryo's development change in each new cell, the
cells themselves follow those instructions, and are modified...
[Figure: RNA polymerase and transcription factors on DNA. Labels: sequence motifs TTTCAC, GCCGCC and TATAA; regulatory region; promoter region; gene to be transcribed.]
• Transcription factors recruit and instruct RNA pol II to initiate RNA synthesis at specific genes by recognising
and binding to DNA elements called promoters.
• Conserved consensus motifs have been predicted for transcription factor binding across the human genome,
and empirical transcription factor binding sites (TFBS) have been determined experimentally using ChIP-seq, the
genome-wide technique that couples chromatin immunoprecipitation with high-throughput sequencing.
Transcription factors:
https://www.youtube.com/watch?v=XzVXhemtwmA
Transcription regulation:
https://www.youtube.com/watch?v=Gs3llepaaB0
Transcription factors specificity:
• Several studies have indicated that the primary DNA-binding specificity of transcription factors (TFs)
evolves slowly, and is extremely conserved between mammalian species.
• The origin of most structural families of TFs dates well before the emergence of mammals,
and even predates the divergence of vertebrates and invertebrates. Within each TF family, DNA
binding specificity has also diverged considerably, with many families having 2–10 different
subclasses displaying different primary binding specificities.
• Databases collecting TF binding specificity information, such as TRANSFAC (Matys et al., 2006)
and Jaspar (Bryne et al., 2008) contain a large number of specificities from different species.
However, the data are generally derived using different methods in different laboratories, and
therefore it is very difficult to separate method-specific biases from real differences in binding
specificity, particularly in cases where the differences are not very pronounced.
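Binding specificities of this kind are commonly summarized as a position weight matrix (PWM) built from aligned binding sites. The sketch below shows one common construction, log-odds scores against a uniform background with a pseudocount, and is not the specific procedure of TRANSFAC or Jaspar. The aligned sites, the pseudocount and the uniform background of 0.25 are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned binding sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        scores = {}
        for base in BASES:
            frequency = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(frequency / background)
        pwm.append(scores)
    return pwm


def score_site(pwm, site):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, seq):
    """Return the 0-based start of the highest-scoring window in seq."""
    width = len(pwm)
    starts = range(len(seq) - width + 1)
    return max(starts, key=lambda i: score_site(pwm, seq[i:i + width]))


# Hypothetical aligned binding sites for an imaginary TF.
sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
pwm = build_pwm(sites)

sequence = "GCGCTATAAAGGC"
start = best_hit(pwm, sequence)
print(start, sequence[start:start + len(pwm)])
```

Because the scores depend on which sites were collected and how, two matrices for the same TF built from different experiments can differ, which is the method-specific bias the last bullet warns about.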