
FROM CHEMISTRY TO BIOLOGY:

via the RNA World

Saurja DasGupta
Howard Hughes Medical Institute
Dept. of Molecular Biology, Massachusetts General Hospital
Dept. of Genetics, Harvard Medical School
The origin of life is a:
- scientific problem
- historical problem
4 - 3.8 billion years ago

3.5 billion years ago


3.8 - 3.5 billion years ago

Prebiotic CHEMISTRY

RNA World
Modern BIOLOGY

“The origin of life cannot be discovered but only reinvented” – Eschenmoser


The only way to solve the mystery is to do experiments

armchair
philosophy

“What I cannot create, I do not understand” - Feynman


Life is so complex and diverse:
Where do we even start?

A simple bacterial cell is complex


similar to

Last universal common ancestor (LUCA) Bacteria


Synthetic chemists break complex molecules
into simpler precursors using retrosynthetic logic

TAXOL – a complex natural product


Nicolaou, KC; Yang, Z; Liu, JJ; Ueno, H; Nantermet, PG; Guy, RK; Claiborne, CF;
Renaud, J; et al. (February 1994). "Total synthesis of taxol". Nature. 367 (6464): 630–4.

has to make chemical sense + commercially available starting material


TAXOL
Retrosynthetic analysis of a living cell
LIFE: a self-sustaining chemical system capable of Darwinian evolution
– NASA (Gerald Joyce)

genes = DNA
enzymes = proteins

gene
enzyme
compartment

~3.6 bya
phospholipid-based primitive cell (protocell)
compartments with ion channels that can undergo evolution

has to make biochemical sense + prebiotically available starting material


Message is transferred to
RNA

Genes are made of


DNA

Enzymes are made of


Proteins
Life didn’t always use the
DNA-RNA-protein blueprint

CENTRAL PROBLEM:
The ribosome can only make proteins by reading RNA (not DNA): Need RNA to make proteins
Need protein enzymes (polymerases) to make RNA from DNA

Can both genes and enzymes be made of one biopolymer?

Minor problem:
proteins might be too complex to have formed spontaneously on the early Earth
Basic requirements for a genetic polymer

Genetic information stored
Genetic information copied

ONLY nucleic acids contain information that can be copied/propagated due to base-pairing
Proteins cannot be genetic polymers

Genetic information propagated

Kaddour, H.; Sahai, N. Synergism and Mutualism in Non-Enzymatic RNA Polymerization. Life 2014, 4, 598-620.
Base-pairing makes RNA/DNA ideal for carrying genetic messages

nucleobase

phosphate

nucleotide

ribose

C/G/A/U

backbone

hydrogen bond
RNA
Hydrogen bonds are Nature’s molecular glue
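
As a minimal sketch of why base pairing makes genetic information copyable, the snippet below derives the strand that pairs with an RNA sequence using only the Watson-Crick rules (A-U, G-C). The function name `complement_strand` and the example sequence are illustrative choices, not part of the talk.

```python
# Watson-Crick pairing rules for RNA: A pairs with U, G pairs with C.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement_strand(template: str) -> str:
    """Return the strand that base-pairs with `template`, written 5'->3'.

    Paired strands run antiparallel, so the complement is read in reverse.
    """
    return "".join(PAIRS[base] for base in reversed(template.upper()))


copy = complement_strand("GGGCGC")
# Copying the copy regenerates the original sequence: the information propagates.
original_again = complement_strand(copy)
```

Two rounds of copying return the starting sequence, which is the sense in which a base-paired polymer can store and propagate a message.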
Basic requirements for an enzymatic polymer

RNA can fold into ‘protein-like’ complex folds guided by base pairing

“Nature’s attempt to make RNA do the job of proteins”
- Francis Crick

“RNA: a tape with a shape”
- Michael Yarus

Courtesy: Janet Iwasa


Looking at life around: GENES can be RNA

RNA genome

SARS-CoV-2: genes made of RNA


Looking at life around: ENZYMES can be RNA

RNA enzymes or Ribozymes discovered accidentally

1989: discovery of RNA enzymes

Genes are made of


DNA
Crystal structure of the ribosome showed
no protein within ~20 Å of the active site
Ribosome = ribozyme
2009: structure
All proteins are made by the RNA of the ribosome
Enzymes are made of
Proteins
GENES: RNA genome (coronavirus)
ENZYMES: RNA enzyme (ribosome)
Primitive cell: RNA as gene and enzyme

RNA WORLD HYPOTHESIS


Early life (~3.8 billion years ago) was likely based on RNA
Molecular fossils of the RNA WORLD

Ribonuclease P
ribosome (rRNA)

pre-tRNA_1 pre-tRNA_2 tRNA

DNA pre-mRNA mRNA protein

splicing (editing the RNA message)
Eukaryotes: spliceosome
Prokaryotes: group I intron, group II intron

miRNA gene silencing
siRNAs, lncRNAs, riboswitches, snoRNAs...

The RNA WORLD exists inside all of us!
Retrosynthetic analysis of a living cell

Primitive cell, RNA enzyme (ribozyme), Short pieces of RNA (oligos), Monomer (nucleotide), ‘Feedstock’ molecules

1. Synthetic chemistry problem
2. Assembly problem
3. Biochemistry problem: How does an RNA behave like an enzyme?

1. Synthetic chemistry problem
How to make nucleotides from scratch?

Classical synthetic strategy:
- Make ribose sugar
- Make nucleobases A, C, G, U
- Put both parts together (sugar + nucleobase)

Yadav, M.; Kumar, R.; Krishnamurthy, R. Chem. Rev. 2020, 120, 11, 4766–4805
How to make nucleotides from scratch?
Butlerow’s ‘Formose’ reaction
1C, alkaline pH → 5C (RIBOSE)
- Yields <1%
- High [HCHO] required

Oro’s purine synthesis
1C, 1N → 5C, 4N (PURINE)
- Yields <1% (max 15%)
- Very high [HCN] required
Yadav, M.; Kumar, R.; Krishnamurthy, R. Chem. Rev. 2020, 120, 11, 4766–4805
How to make nucleotides from scratch?

Butlerow’s ‘Formose’ reaction Oro’s purine synthesis Coupling reaction fails

SUGAR NUCLEOBASE NUCLEOSIDE

Very low yields


Plethora of side products
Nucleobase rings are created after coupling with ribose
Carell pathway Sutherland pathway

atoms for nucleobase

atoms for nucleobase


atoms for ribose

pyrimidines purines pyrimidines

Yadav, M.; Kumar, R.; Krishnamurthy, R. Chem. Rev. 2020, 120, 11, 4766–4805
Purines and pyrimidines from a common precursor

Powner-Szostak pathway

pyrimidine purine
pathway pathway

But 8-oxo
purines made
2. Assembly problem
How to make oligonucleotides from nucleotides without enzymes?

LG (leaving group)
Nucleoside triphosphates (NTPs) are used
by polymerase enzymes to make RNA

BUT, pyrophosphate is not a sufficiently good LG for the reaction to happen without enzyme catalysis.
NTPs would not have been RNA building blocks
before the emergence of ribozymes
Nucleosides activated with imidazoles are reactive
for non-enzymatic RNA polymerization

Screening for the best imidazole-derivative LG


primer primer

Winner: 2-aminoimidazole Efficient RNA polymerization WITHOUT enzyme


Li, L. et al. J. Am. Chem. Soc. 2017, 139, 1810−1813
2-aminoimidazole (2AI) synthesis is prebiotically plausible

cyanamide

glycolaldehyde

2-aminooxazole

cyanamide

Yadav, M.; Kumar, R.; Krishnamurthy, R. Chem. Rev. 2020, 120, 11, 4766–4805

glycolaldehyde

2-aminoimidazole

Fahrenbach, A. et al. J. Am. Chem. Soc. 2017, 139, 8780-8783


2-aminoimidazole is a likely byproduct in the
prebiotic nucleoside synthesis pathway
Nucleotides can be activated with 2AI prebiotically
Sutherland prebiotic activation

acetaldehyde

Reactive for polymerization


Prebiotic feedstock
Imidoyl phosphate

Mariani, A. et al. J. Am. Chem. Soc. 2018, 140, 8657−8661

Fahrenbach prebiotic activation


coupling agent synthesized in situ, similar to CDI
Reactive for polymerization
Prebiotic feedstock

Yi R. et al. Chem. Comm. 2018, 54, 511−514


So far so good, but what about the problems?

Prebiotic syntheses of nucleotides

- Still too many side products: e.g., ribose stereochemistry is hard to control (ara instead of rib)
- Purine synthesis is still challenging
- Constrained by a specific sequence of reactions and availability of key components

Assembly of activated nucleotides

- Final yield of oligomer from monomer is still poor: long RNAs still inaccessible
- Not all sequences are copied efficiently, i.e. sequence bias; fidelity is not great

Template GGGCGC, primer (OH) + 2AIpG, 2AIpC:
product CCCGCG. Fast reaction.
BUT
Template GGGCGC, primer (OH) + 2AIpG, 2AIpC, 2AIpU, 2AIpA:
product CUCGCG. In the presence of all 4 nucleotides, there is a chance of mismatches.
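
The fidelity problem above can be made concrete with a small check. This is a sketch that assumes the product and template are compared position by position, as they are drawn on the slide, and that only Watson-Crick pairs count as correct. The function name `mismatches` is hypothetical.

```python
# Allowed (product base, template base) pairs under Watson-Crick rules.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatches(product: str, template: str) -> list[int]:
    """Return 0-based positions where the aligned bases are not a Watson-Crick pair."""
    if len(product) != len(template):
        raise ValueError("product and template must have the same length")
    return [
        i
        for i, pair in enumerate(zip(product.upper(), template.upper()))
        if pair not in WATSON_CRICK
    ]


clean_copy = mismatches("CCCGCG", "GGGCGC")   # only G and C monomers supplied
faulty_copy = mismatches("CUCGCG", "GGGCGC")  # all four monomers supplied
```

With the slide's sequences, the first product has no mismatches, while the second has a single U placed opposite G at the second position.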
Template GGGCGC, primer (OH) + 2AIpG, 2AIpC:
product CCCGCG. Fast reaction, good yields.
BUT
Template AAAUAU, primer (OH) + 2AIpU, 2AIpA:
product UU. SLOW reaction, POOR yields.
Assembly of activated nucleotides

- Hydrolysis of activated nucleotides: 2-aminoimidazole is a good LG = labile to hydrolysis

Template GGGCGC, primer (OH) + 2AIpG, 2AIpC, with pG, pC from hydrolysis:
product CC.
Hydrolyzed = no LG = unreactive = inhibitor
Getting closer and closer to biology...
3. Biochemistry problem
How does RNA behave as an enzyme?

Why is RNA a bad enzyme OR why did protein enzymes take over from ribozymes?

- Negatively charged backbone: FOLDING is difficult due to electrostatic repulsion.
  ** Protein amide backbone is neutral.

- Only 4 chemically similar bases, none of whose pKa is near neutral pH.
  ** Proteins have >20 side chains: +ve, -ve, neutral, aromatic, aliphatic.
  Histidine has a pKa ~7 = can mediate general acid/base catalysis.
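
The pKa argument can be illustrated with the Henderson-Hasselbalch relation. The sketch below uses the slide's pKa ~7 for histidine; the pKa of 3.5 for an unperturbed adenine N1 is an approximate textbook value supplied here as an assumption, not a number from the talk.

```python
def fraction_protonated(pka: float, ph: float = 7.0) -> float:
    """Fraction of a titratable group in its protonated (acid) form at a given pH.

    From Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


histidine = fraction_protonated(7.0)    # pKa ~7, as stated on the slide
adenine_n1 = fraction_protonated(3.5)   # assumed approximate pKa, for illustration
```

A group with a pKa near the working pH is about half protonated, so both its acid and base forms are available for general acid/base catalysis. A group whose pKa is several units away sits almost entirely in one form.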

What does RNA have to do to be an enzyme?


Case study: a ribozyme that ‘cuts’ RNA site-specifically
Protein enzymes that cut RNA:
RNases A, T1, P1, T2, S1, V1...

Substrate Product

1. How does the RNA enzyme bind the correct RNA substrate?

2. Substrate has 34 nt, 33 chemically identical bonds – how does the RNA enzyme know where to cut?

3. What is the chemical mechanism of catalytic RNA cleavage at the active site?
Electron density map: Suslov NB*, DasGupta S* et al., Nat. Chem. Biol., 2015, 11, 840-846
DasGupta S et al., J. Am. Chem. Soc., 2017, 139, 9591-9597
RNA enzyme

RNA enzyme has a complex 3D structure (not just a string of nucleotides)


Electron density map:
substrate bound to RNA enzyme

RNA enzyme has a complex 3D structure (not just a string of nucleotides)


Electron density to 3D structure
substrate bound to RNA enzyme

Substrate fits nicely into a cavity in the RNA enzyme


STEP 1: folding

backbone

This is where RNA is cut

The architecture places the substrate close to the important parts of the enzyme
STEP 1: folding

backbone

nucleotide
This is where RNA is cut

X-ray crystallography provides atomic level understanding


STEP 2: substrate
recognition and binding

Enzyme interacts with substrate intimately


STEP 2: substrate
recognition and binding

This is where
RNA is bound
A recognition module in the enzyme
interacts specifically with the substrate

Complementary
recognition
Molecular basis for
substrate recognition

Hydrogen bonds
glue the substrate
to the enzyme
STEP 3: catalysis

Enzyme interacts with substrate intimately


STEP 3: catalysis

This is where
RNA is cut
Enzyme surface ‘feels’
the unusual twist and
‘knows’ where to cut

RNA strand is
twisted, splaying
the cleavage site
STEP 3: catalysis

Peek into the heart


Active site of the enzyme
STEP 3: catalysis

Unveiling the chemistry behind


enzymatic RNA cleavage

cleavage site
STEP 3: catalysis

Bond broken:
RNA strand is CUT

New chemical linkage is formed


Nucleotides close to the ‘cut’ site may be catalytically important

catalytically important
Which functional group is important?
What are the catalytic interactions?

Adenine (purine ring positions numbered 1-9)

purine 6-methyl 2-aminopurine 2,6-diaminopurine guanine


adenine
Mg2+ close to the ‘cut’ site may be catalytically important

catalytically important?
Is an active site Mg2+ catalytically important?

hard hard soft


hard soft soft

Cd2+ binds S better than Mg2+

Metal rescue experiment


If activity is reduced when O is replaced by S (with Mg2+)
and rescued when Mg2+ is replaced by Cd2+,
the metal has a catalytic role
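
The logic of the metal rescue experiment can be written as a small decision rule. This is only a sketch: the function name, the three activity inputs, and the 10-fold threshold are hypothetical choices for illustration, not values from the talk.

```python
def metal_rescue_verdict(
    activity_o_mg: float,
    activity_s_mg: float,
    activity_s_cd: float,
    fold: float = 10.0,
) -> str:
    """Interpret relative cleavage activities from a metal rescue experiment.

    activity_o_mg: native oxygen ligand, Mg2+ present
    activity_s_mg: oxygen replaced by sulfur, Mg2+ present
    activity_s_cd: oxygen replaced by sulfur, Cd2+ present
    fold: assumed minimum fold-change that counts as an effect
    """
    for value in (activity_o_mg, activity_s_mg, activity_s_cd):
        if value <= 0:
            raise ValueError("activities must be positive")
    thio_effect = activity_o_mg / activity_s_mg >= fold  # S substitution hurts
    rescued = activity_s_cd / activity_s_mg >= fold      # soft metal restores it
    if thio_effect and rescued:
        return "catalytic role supported"
    if thio_effect:
        return "thio effect without rescue"
    return "no thio effect"


# Made-up relative activities, for illustration only.
verdict = metal_rescue_verdict(1.0, 0.01, 0.5)
```

Only the combination of a loss on sulfur substitution and a recovery with the softer metal points to a direct metal contact at that position.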
Natural ribozymes

What about the lost ribozymes from the RNA World?

Can we bring them back?


What are the enzyme functions that
would have been useful in the RNA World?

RNA catabolism

RNA anabolism

Translation

Nucleoside synthesis from


nucleobase + sugar

Transcription

1st step to Translation


Chen X, Li N, Ellington AD. Chem Biodivers. 2007 4:633-655S
Evolving ribozymes in lab: limited only by creativity

Astrobiologist’s dream: self-replicating ribozyme


Retrosynthetic analysis of a living cell

RNA
gene enzyme

Cell membrane

Primitive cell
PRIMITIVE CELLS WERE likely BOUNDED BY FATTY ACIDS

[Figure: primitive cell membrane (primitive) vs. modern cell membrane (modern)]

Meteorites contain:
fatty acids, RNA nucleobases, ribose sugar

Perfect ingredients for cells made of fatty acid membranes with RNA inside
PRIMITIVE CELLS WERE likely BOUNDED BY FATTY ACIDS

Fatty acids spontaneously assemble to form spherical compartments
PRIMITIVE CELLS: FATTY ACIDS encapsulating RNA

substrate

RNA enzyme

RNA enzymes function within primitive cells

under a microscope

TOWARD CELLULAR LIFE: a fatty sac full of RNA

fatty acid membrane
RNA
Fatty acid
RNA building blocks
cellular compartment

“Be fruitful and multiply”


NON-LIVING MATTER STARTS TO EVOLVE

Better RNA enzymes make RNA faster:
accumulation of RNA triggers faster cell division

Weaker RNA enzymes make RNA slower:
cell division can’t keep up with its competition

LIFE is born
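
The competition described above can be sketched as two protocell lineages that divide at different rates. The per-step growth rates (10% and 5%) and the function name are arbitrary assumptions chosen to show the trend. The model is deterministic exponential growth and ignores mutation and resource limits.

```python
def better_fraction(
    rate_better: float,
    rate_weaker: float,
    steps: int,
    start_fraction: float = 0.5,
) -> list[float]:
    """Track the population share of the lineage with the better RNA enzyme.

    Each step, every lineage grows by its own per-step rate; the share is
    recomputed from the two (unnormalized) population sizes.
    """
    better, weaker = start_fraction, 1.0 - start_fraction
    history = [better / (better + weaker)]
    for _ in range(steps):
        better *= 1.0 + rate_better
        weaker *= 1.0 + rate_weaker
        history.append(better / (better + weaker))
    return history


# Assumed rates: better ribozyme -> 10% growth per step, weaker -> 5%.
trajectory = better_fraction(rate_better=0.10, rate_weaker=0.05, steps=100)
```

Even a modest difference in division rate compounds, so the lineage carrying the better enzyme comes to dominate the population.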
With the PPI lab…

2012 2016

dasgupta@molbio.mgh.harvard.edu
