
1 2022 Molecular Biology


Nucleic acids are polymers of nucleotides (polynucleotides)


Types of nucleic acids
1. Deoxyribonucleic acid (DNA) – polymer of deoxyribonucleotides [in the nucleus (eukaryotes: linear), in
prokaryotes (circular, bare) and in mitochondria (circular)]
2. Ribonucleic acid (RNA) – polymer of ribonucleotides (in nucleus, cytoplasm and mitochondria) – 90%
Rosalind Franklin took X-ray diffraction photographs of DNA.
James Watson and Francis Crick proposed structure of DNA in 1953.

Purines
(1) Adenine – 6-aminopurine
(2) Guanine – 2-amino-6-oxypurine
Pyrimidines
(1) Cytosine – 2-oxy-4-aminopyrimidine
(2) Thymine – 2,4-dioxy-5-methylpyrimidine
(3) Uracil – 2,4-dioxypyrimidine
Bases
1. Purine bases: adenine and guanine (2 rings, in both DNA and RNA)
2. Pyrimidine bases: cytosine, thymine (DNA) and uracil (RNA) – 1 ring
Nucleosides and nucleotides
• Nucleoside = sugar + base (base linked to C1’ of the sugar by an N-glycosidic bond)
• Nucleotide = sugar + base + phosphate (phosphate group attached at the 5’ position of the sugar by an ester bond)
• A nucleotide is a nucleoside mono-, di- or triphosphate.
• The two terminal phosphate bonds are acid anhydride bonds possessing high energy (-7.3 kcal/mol).
Nucleic acid strand
• Nucleotides are linked by a phosphodiester bond between the 3’-OH group of the first nucleotide and the
phosphate group at the 5’ position of the pentose of the second nucleotide.
• Each strand has polarity, with a 5’ end and a 3’ end. A phosphate group is often found at the 5’ end and a
hydroxyl group at the 3’ end.

Base      Deoxyribonucleoside  Deoxyribonucleotide            Alternative name
Adenine   Deoxyadenosine       Deoxyadenosine monophosphate   Deoxyadenylic acid
Guanine   Deoxyguanosine       Deoxyguanosine monophosphate   Deoxyguanylic acid
Cytosine  Deoxycytidine        Deoxycytidine monophosphate    Deoxycytidylic acid
Thymine   Deoxythymidine       Deoxythymidine monophosphate   Deoxythymidylic acid

Base      Ribonucleoside   Ribonucleotide             Alternative name
Adenine   Adenosine        Adenosine monophosphate    Adenylic acid
Guanine   Guanosine        Guanosine monophosphate    Guanylic acid
Cytosine  Cytidine         Cytidine monophosphate     Cytidylic acid
Uracil    Uridine          Uridine monophosphate      Uridylic acid

Structure of DNA
DNA is the chemical basis of heredity. The genetic information of all living organisms, except RNA
viruses, is stored in DNA. The B-form was described by Watson and Crick. It is a double-stranded structure
in the form of a right-handed double helix. The two strands are antiparallel, i.e., the 5’ to 3’ directions of
the two polynucleotide strands run opposite to each other, and this arrangement produces a stable
association between the strands. Each strand is a polymer of deoxyribonucleotides (deoxyribose, N-base and
phosphate). Nucleotides are linked by 3’,5’-phosphodiester bonds. The N-bases in DNA are purines (adenine
and guanine) and pyrimidines (thymine and cytosine). The hydrophilic deoxyribose and phosphate groups form
the backbone on the outside, while the hydrophobic bases are stacked inside the molecule. The base
sequence carries the genetic information.
The width of the double helix is 20 Å (2 nm). Each turn of the helix is 34 Å (3.4 nm) long and contains
10 base pairs. Interwinding of the two antiparallel strands produces a major groove and a minor groove.
The number of base pairs and the length of DNA vary from species to species. Base pairing and hydrophobic
base-stacking interactions hold the two DNA strands together and maintain the stability of the double
helix. Base pairing is complementary: A pairs with T by 2 hydrogen bonds and G pairs with C by 3 hydrogen
bonds. Because of this specific base pairing, total purines equal total pyrimidines (Chargaff’s rule).
The nucleus contains linear DNA packed into chromosomes, and mitochondria contain circular, bare DNA.
Alternative forms of DNA: the A to E and Z families have been shown by X-ray crystallography. The A form
exists at high salt concentration and is a right-handed helix with 11 bp per turn, but it is not found as
such in vivo. When the relative humidity of B-form DNA falls below 75%, the B form undergoes a reversible
transition into the A form. The Z form is a left-handed helix with 12 bp per turn, formed by alternating
G:C and C:G pairs. It is favored at high ionic concentration. It has a novel zigzag conformation, and its
function is still not clear.
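
The base pairing rule and the B-form dimensions quoted above can be checked with a short Python sketch. The function names and the example sequence are illustrative assumptions, not part of the notes; the geometry simply reuses the figures given above (10 bp and 3.4 nm per turn).

```python
# Illustrative sketch: complementary pairing, Chargaff's rule and B-form geometry.
# All names and the example sequence are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}

BP_PER_TURN = 10    # base pairs per helical turn (B-form, as quoted in the notes)
NM_PER_TURN = 3.4   # length of one helical turn in nm


def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def purine_pyrimidine_counts(strand: str) -> tuple[int, int]:
    """Count purines and pyrimidines over BOTH strands of the duplex."""
    duplex = strand.upper() + reverse_complement(strand)
    purines = sum(base in PURINES for base in duplex)
    return purines, len(duplex) - purines


def helix_dimensions(n_bp: int) -> tuple[float, float]:
    """Return (number of turns, length in nm) for a B-form duplex of n_bp base pairs."""
    turns = n_bp / BP_PER_TURN
    return turns, turns * NM_PER_TURN


if __name__ == "__main__":
    top = "ATGCGGTAAC"
    print(reverse_complement(top))        # partner strand, 5' to 3'
    print(purine_pyrimidine_counts(top))  # equal counts in a fully paired duplex
    print(helix_dimensions(len(top)))
```

Whatever the sequence of one strand, the two counts come out equal, because every purine on one strand is paired with a pyrimidine on the other.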

Denaturation and renaturation of DNA


Denaturation disrupts the hydrogen bonds between bases, resulting in single strands. It is caused by
increased temperature or decreased salt concentration. The DNA strands separate over a temperature range.
The mid-point of this range, called Tm, is influenced by the base composition of the DNA and the salt
concentration of the solution. DNA rich in G-C pairs melts at a higher temperature than DNA rich in A-T
pairs. Denatured DNA can be renatured (annealed) if the denaturing condition is slowly removed.
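
The effect of base composition on Tm can be illustrated with a rough calculation. The sketch below uses the Wallace rule of thumb (2 °C per A-T pair, 4 °C per G-C pair), which is an assumption brought in for illustration: it is not given in these notes, applies only to short oligonucleotides, and ignores salt concentration. The function names are hypothetical.

```python
# Rough illustration only: Wallace rule for short oligonucleotides.
# Tm (deg C) ~ 2 * (A + T) + 4 * (G + C); not taken from the notes.


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in deg C for a short oligonucleotide."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for oligo in ("ATATATATATATAT", "ATGCATGCATGCAT", "GCGCGCGCGCGCGC"):
        print(oligo, round(gc_fraction(oligo), 2), wallace_tm(oligo))
```

For oligonucleotides of equal length, the estimate rises with G-C content, in line with the statement that G-C rich DNA melts at a higher temperature.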

Functions of DNA
1. Genetic information in DNA serves as the source of information for synthesis of all proteins of the cell
and organism. Thus, DNA serves as the template for transcription into RNA, which is then translated into a
specific protein.
2. It provides progeny with the genetic information possessed by the parent. Thus, both strands of DNA serve
as templates for replication into daughter DNA.

Chromosome

Every cell of a multicellular organism contains the same genetic material. The human genome is made
up of 3.2 × 10^9 nucleotides distributed over 24 different chromosomes (22 autosomes plus the X and Y
chromosomes). Each chromosome contains between 48 million and 240 million base pairs. A somatic cell
carries 23 pairs of chromosomes of different sizes (22 pairs of autosomes and one pair of sex chromosomes).
Each human somatic cell contains two copies of each chromosome, one inherited from the mother and
one from the father. The maternal and paternal chromosomes of a pair are called homologous chromosomes
(homologs). The only non-homologous chromosome pair is the sex chromosome pair in males, where the Y
chromosome is inherited from the father and the X chromosome from the mother. Somatic cells contain the
diploid number: 22 pairs of autosomes and one pair of sex chromosomes. Germ cells contain the haploid
number: 22 autosomes and either an X or a Y chromosome. The large excess of DNA that does not carry
critical information is called junk DNA. It acts as spacer material and is considered essential for
long-term evolution of the species and proper expression of genes.
Chromosomes are classified according to the position of the centromere.
1. Metacentric chromosome: the centromere is near the middle of the chromosome.
2. Acrocentric chromosome: the centromere is near the telomere.
3. Submetacentric chromosome: the centromere is located between the middle and the telomere.
DNA sequences on either side of the centromere are designated as the arms of a chromosome: long arm (q)
and short arm (p). The arms are subdivided into regions.
Maps to describe the location of a particular gene on a chromosome
1. Map using the cytogenetic location, which is based on a distinctive pattern of bands created by
staining chromosomes with certain chemicals
2. Map using the molecular location, which is based on the sequence of DNA base pairs and describes
the precise location of a gene on the chromosome

Organization of Eukaryotic chromatin

Each chromosome of a eukaryotic cell contains a single, large duplex DNA molecule. DNA in the
chromosome is tightly associated with histone proteins to form nucleosomes. Histones are highly conserved,
positively charged proteins rich in arginine and lysine. Thus they can bind the negative charges of the
sugar-phosphate backbone of DNA, reducing electrostatic repulsion and allowing tighter packing.
Histone remodeling or modification is important for the regulation of chromatin structure and the
control of gene expression. It is carried out by enzymatic covalent modification such as methylation,
acetylation, ADP-ribosylation, phosphorylation, glycosylation or ubiquitination.
The fiber-like unpacked chromosome is called chromatin. Nucleosomes are the fundamental units of
chromatin. A nucleosome consists of a histone octamer, (H2A, H2B, H3, H4)2, encircled by double-stranded
DNA (about 146 bp). Outside the nucleosome core, H1 prevents unwinding of the DNA segment from the
nucleosome. Nucleosomes are joined by linker DNA of about 20 to 90 bp.
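
As a rough worked example, the figures above (146 bp of core DNA plus 20 to 90 bp of linker per nucleosome) give an order-of-magnitude estimate of how many nucleosomes a genome could carry. The sketch assumes, purely for illustration, that the whole 3.2 × 10^9 bp haploid genome is packaged with uniform spacing; the function and constant names are hypothetical.

```python
# Order-of-magnitude sketch, assuming uniform nucleosome spacing over the whole genome.

CORE_DNA_BP = 146                  # DNA wrapped around one histone octamer (from the notes)
LINKER_RANGE_BP = (20, 90)         # linker DNA length range (from the notes)
HAPLOID_GENOME_BP = 3_200_000_000  # haploid human genome size (from the notes)


def nucleosome_count(dna_bp: int, linker_bp: int) -> int:
    """Number of complete nucleosome repeats (core + linker) that fit in dna_bp."""
    if linker_bp < 0:
        raise ValueError("linker length must be non-negative")
    return dna_bp // (CORE_DNA_BP + linker_bp)


if __name__ == "__main__":
    for linker in LINKER_RANGE_BP:
        print(f"linker {linker} bp: about {nucleosome_count(HAPLOID_GENOME_BP, linker):,} nucleosomes")
```

Under these assumptions the estimate is in the tens of millions of nucleosomes per haploid genome; shorter linkers give the higher figure.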
The nucleosome cores organize into a structure called the 30 nm fiber. At the next level, the 30 nm
fiber is folded into loops. The loops are bound to a protein scaffold consisting of H1 histone and several
non-histone proteins, Sc1 (a topoisomerase II) and Sc2. The fibers supercoil into chromatin and then form
the compact chromosome. Because of the nucleosomes, chromatin has a beads-on-a-string appearance under the
electron microscope. There are 2 types of chromatin: transcriptionally inactive heterochromatin and
transcriptionally active euchromatin. Under the electron microscope, heterochromatin is densely packed and
euchromatin is seen as loose strands.
The important proteins in the nucleus are histones and nucleoplasmin. Nucleoplasmin is an anionic
pentameric protein. It binds reversibly to histones but does not bind to DNA or chromatin. It is important
for gene expression and replication.

Cell Cycle
Cell cycle is the orderly sequence of events by which a cell duplicates its chromosomes and other cell
contents followed by division of a cell into two genetically identical daughter cells.
Four sequential phases
1. G1 phase (Gap 1) – period of cell growth and differentiation prior to replication
2. S phase (synthesis) – DNA synthesis occurs, duplicating the chromosomes.
3. G2 phase (Gap 2) – period after replication; preparatory phase before cell division
4. M phase (mitosis) – chromosome segregation and cell division occur. It comprises two major events:
nuclear division (mitosis), during which the duplicated chromosomes are distributed equally and exactly
to the daughter cells, and cytoplasmic division (cytokinesis).
An additional phase is the G0 phase, in which the cell is in a quiescent state. G1, S and G2 together are
called interphase, and gene expression occurs throughout. The two gap phases allow time for cell growth and
provide checkpoints (G1 and G2 checkpoints).
A growth factor is needed for a cell to re-enter the cell cycle. If external conditions are unfavorable,
cells delay progress through G1 and may even enter G0.
The length of the cell cycle varies among cell types. In the human body, many cells divide
frequently, e.g., hair follicle and skin cells. Red blood cell precursors divide a number of times.
Fibroblasts and epithelial cells may spend very little or no time in G0, whereas adult liver cells, brain
cells and myocytes spend most of their time in G0. Early cleavage divisions in embryonic cells are rapid:
G1 and G2 are completely omitted, and the cells cycle rapidly between M and S phase.
Overview of cell proliferation and growth
Cells of a multicellular organism must receive positive signals in order to grow and divide. Growth
factors bind to their specific cell surface receptors and initiate selective signaling cascades.
The important checkpoints occur at 3 stages: at the G1-S transition, during S phase and at the G2-M
boundary. The G1 checkpoint is more complex and is under strict control. Four types of cyclins and 5 types
of cyclin-dependent kinases (CDK) regulate the cell cycle transition points.
Mechanism of cell cycle progress
All eukaryotic cells have gene products (proteins) that govern the transition from one phase of the cell
cycle to another. Growth factor stimulation induces genes that produce proteins needed for cell cycle
progression, e.g., cyclins. Cyclin concentrations rise and fall at specific times, and cyclins are abruptly
destroyed during mitosis.
Cyclins activate specific cyclin-dependent protein kinases (CDK), which in turn phosphorylate many
proteins needed for progression through the cell cycle. When a cell is exposed to a mitogen in G1 phase,
the cyclin D level rises. By activating CDK4 and CDK6, cyclin D induces the synthesis of cyclin E. Cyclin E
and CDK2 allow the cell to pass the G1 checkpoint and initiate DNA synthesis in early S phase. Cyclin A,
CDK1 and CDK2 bring the cell through S phase and remain active through G2 phase. Cyclin B and CDK1 drive
the transition from G2 to M phase.
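
The cyclin-CDK pairings described above can be kept as a small lookup table for revision. This is a minimal sketch: the stage labels and function names are informal choices made here, and the pairings are only those listed in the paragraph above.

```python
# Stage -> (cyclin, partner CDKs), transcribed from the paragraph above.
# The stage labels are informal keys chosen for this sketch.
CYCLIN_CDK_BY_STAGE = {
    "G1 (mitogen response)": ("cyclin D", ("CDK4", "CDK6")),
    "G1/S checkpoint": ("cyclin E", ("CDK2",)),
    "S to G2": ("cyclin A", ("CDK1", "CDK2")),
    "G2/M transition": ("cyclin B", ("CDK1",)),
}


def complex_for(stage: str) -> str:
    """Describe the cyclin-CDK complex acting at a given stage."""
    cyclin, cdks = CYCLIN_CDK_BY_STAGE[stage]
    return f"{cyclin} + {'/'.join(cdks)}"


def stages_using(cdk: str) -> list[str]:
    """List the stages, in cell cycle order, at which a given CDK is active."""
    return [stage for stage, (_, cdks) in CYCLIN_CDK_BY_STAGE.items() if cdk in cdks]


if __name__ == "__main__":
    for stage in CYCLIN_CDK_BY_STAGE:
        print(f"{stage}: {complex_for(stage)}")
    print("CDK2 acts at:", stages_using("CDK2"))
```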

Regulation of cell cycle progress


The G1 checkpoint is a restriction point in late G1 phase, beyond which the cell is irreversibly
committed to enter S (DNA synthesis) phase. In quiescent cells, the retinoblastoma (Rb) protein, a cell
cycle regulator, binds and inactivates the transcription factor E2F. E2F is necessary for the transcription
of certain genes, e.g., histone genes and genes for DNA replication proteins. On growth factor activation,
the cyclin D-CDK4/CDK6 complex phosphorylates Rb, causing release of E2F, and cell cycle progression occurs.
Termination of cyclin-CDK function in a phase
The cyclin-CDK complex is rapidly inactivated by CDK inhibitor proteins, desensitization of the complex
and degradation of cyclins.
Cell cycle check point
The tumor suppressor protein p53 is activated by DNA damage. It is a transcription factor that induces
expression of the G1-CDK inhibitor protein p21, allowing the cell to repair its DNA before proceeding into
S phase. If the damage is too extensive, the p53 level increases greatly and initiates programmed cell
death, called apoptosis.
Biomedical importance
Mutation of the retinoblastoma gene can cause retinoblastoma. Certain tumor antigens derived from
viruses such as SV40, HSV and HPV may combine with Rb. Rb can then no longer inhibit the cell cycle,
leading to continuous cell division and cancer.
p53 is involved in cell cycle regulation and apoptosis. It is mutated or lost in 50% of human cancers.

Apoptosis

Apoptosis is the process of programmed cell death that occurs in multicellular organisms to limit the
growth and proliferation of cells. It is a physiologic process involved in various developmental and
physiological events. Cells that are unnecessary or a threat to the organism undergo apoptosis.
Apoptosis is mediated by proteolytic enzymes called caspases, which trigger cell death by cleaving
specific proteins in the cytoplasm and nucleus, resulting in
• Halting cell cycle progression
• Disabling homeostatic and repair mechanisms
• Initiating the detachment of the cell from its surrounding tissue structures
• Dismantling structural components such as the cytoskeleton
• Flagging the dying cell for phagocytosis
Dramatic morphological changes in the cells are
• Shrinkage of the cytoplasm
• DNA fragmentation
• Plasma membrane blebbing
• Formation of membrane – enclosed vesicles (apoptotic bodies) and engulfment by phagocytes
• Cell death without lysis or damage to neighboring cells
Apoptosis-mediating genes (suicide genes, oncosuppressor genes) are c-fos, p53 and Rb. Apoptosis-
protecting genes are bcl-2 and other oncogenes.
Stress and other stimuli activate certain cell surface receptors, and a cascade of activation takes place.
Cell death receptors such as the tumor necrosis factor receptor (TNF-R) and Fas (also known as CD95 or
APO-1) mediate apoptosis in a number of cell types, especially immune cells. These receptors initiate
apoptosis by directly recruiting procaspases, resulting in the caspase activation cascade.
Caspase activation cascade
Caspases (cysteinyl aspartate-specific proteases) are proteases with cysteine in the active centre. They
are synthesized as inactive procaspases and are activated one after another. Caspase 8 is the first one
activated (initiator) and the final one is caspase 3, the executor of death (Yama). Cytochrome c released
from mitochondria into the cytosol activates procaspase 9 to caspase 9, causing activation of the whole
pathway. This process is inhibited by B cell lymphoma 2 (bcl-2), which regulates mitochondrial integrity
and cytochrome c release. The tumor suppressor protein p53 activates apoptosis in response to DNA damage.
Biomedical importance
Inappropriate activation of apoptosis can lead to degenerative disorders, whereas subversion or
disruption of the apoptosis machinery can result in cancer or autoimmune disease.

Replication

Replication is the process by which each strand of the parental DNA duplex is copied precisely by base
pairing with complementary deoxyribonucleotides to form daughter DNA molecules.
Common features of replication
➢ It must be complete and carried out with high fidelity to maintain genetic stability within the organism
and species.
➢ The general steps are similar in eukaryotes and prokaryotes.
➢ Sequences on the DNA template are copied specifically and accurately according to the complementary
base pairing rule.
➢ Polymerization of the new strand proceeds from 5’ to 3’, as the polymerase joins nucleotides only by
3’,5’-phosphodiester bonds.
➢ It occurs during the S phase of the cell cycle. In eukaryotes, all parts of the genome are replicated only
once during each cell cycle.
➢ Both strands serve as templates and are replicated simultaneously. Replication proceeds bidirectionally
from the replication bubble.
➢ Semi-discontinuous: although both strands are polymerized 5’ to 3’, one strand is synthesized
continuously (leading strand) and the other discontinuously (lagging strand).
➢ Semi-conservative: one strand of the parent DNA molecule is conserved in each new double helix, paired
with a newly synthesized complementary strand.
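
The semi-conservative feature listed above can be made concrete with a toy model in Python. This is a sketch under simplifying assumptions: strands are plain strings written 5’ to 3’, the `generation` tag and all names are hypothetical, and primers, Okazaki fragments and enzymes are ignored.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class Strand:
    sequence: str    # written 5' to 3'
    generation: int  # 0 = parental; n = synthesized in replication round n


def synthesize_partner(template: Strand, generation: int) -> Strand:
    """Build the complementary, antiparallel strand for a template strand."""
    new_sequence = "".join(COMPLEMENT[base] for base in reversed(template.sequence))
    return Strand(new_sequence, generation)


def replicate(duplex: tuple[Strand, Strand], generation: int) -> list[tuple[Strand, Strand]]:
    """Each old strand is kept and paired with one newly synthesized strand."""
    return [(old, synthesize_partner(old, generation)) for old in duplex]


if __name__ == "__main__":
    parental = (Strand("ATGC", 0), Strand("GCAT", 0))
    for old, new in replicate(parental, generation=1):
        print(f"old (gen {old.generation}): {old.sequence}   new (gen {new.generation}): {new.sequence}")
```

Each daughter duplex contains exactly one generation-0 strand, which is what "semi-conservative" means.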

Eukaryotic DNA replication


Separating parental strands
• Eukaryotic DNA is linear. Chromosomes are relaxed into chromatin, which contains the nucleosome
structure. Chromatin must then be released into bare DNA and histones.
• Eukaryotic DNA has multiple replication origins. DNA-binding proteins bind to specific sequences
located within the origin.
• DNA helicase carries out ATP-dependent unwinding of DNA in both directions from the origin of
replication. Unwinding is seen at each of the multiple replication origins. Single-strand DNA-binding
protein stabilizes the unwound DNA and inhibits re-association of the strands, resulting in a
replication fork. Supercoiling that develops between replication bubbles is relieved by ATP-dependent
DNA topoisomerase.
Strand elongation
• DNA polymerase catalyzes the addition of mononucleotides to the growing DNA chain in the 5’ to 3’
direction. The nucleotide precursors are deoxyribonucleoside triphosphates (dNTP). Neighboring
nucleotides are joined by phosphodiester bonds, releasing pyrophosphate in each reaction.
• Polymerization follows the base pairing rule. Because DNA polymerization can occur only in the 5’ to
3’ direction, the two daughter strands are made differently: the leading strand is synthesized
continuously and the lagging strand is synthesized as DNA fragments (Okazaki fragments).
• DNA polymerases cannot initiate replication de novo. Primase (an RNA polymerase) synthesizes an
RNA oligonucleotide primer complementary to each parental DNA strand.
Accuracy of replication
• There is about one mistake per 10^9 nucleotides copied.
• DNA polymerase removes mispaired nucleotides from the 3’ end of the growing chain because it has 3’ to 5’
exonuclease activity (proofreading). Despite their very high accuracy, polymerases can
incorporate nucleotide analogs used in chemotherapy.
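
To put the error rate in perspective, a back-of-the-envelope calculation can combine it with the genome size quoted earlier in these notes (3.2 × 10^9 nucleotides). This is a sketch that assumes a uniform error rate of one per 10^9 nucleotides; the names are hypothetical.

```python
# Back-of-the-envelope sketch: expected replication errors per genome copy,
# assuming a uniform error rate (about 1 per 10^9 nucleotides, as stated above).

ERROR_RATE_PER_NT = 1e-9   # mistakes per nucleotide copied
HAPLOID_GENOME_NT = 3.2e9  # human haploid genome size quoted in these notes


def expected_errors(nucleotides_copied: float, error_rate: float = ERROR_RATE_PER_NT) -> float:
    """Expected number of uncorrected mistakes when copying the given number of nucleotides."""
    return nucleotides_copied * error_rate


if __name__ == "__main__":
    print("per haploid genome copy:", round(expected_errors(HAPLOID_GENOME_NT), 1))
    print("per diploid genome copy:", round(expected_errors(2 * HAPLOID_GENOME_NT), 1))
```

Under these assumptions only a handful of errors are expected each time the whole genome is copied.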
Primer removal and gap filling
• The RNA primers are removed from the 5’ end of Okazaki fragments by RNase H activity. The gap is then
filled with deoxyribonucleotides and sealed by DNA ligase. Finally, the DNA is organized into
chromatin.
Post replication modification
• Newly synthesized DNA is glycosylated and methylated by specific enzymes. Enzymatic methylation of
adenine residues protects the DNA from destruction by restriction endonucleases.

Biomedical importance
1. Inhibitors of DNA gyrase are potent antibiotics.
E.g., quinolones and fluoroquinolones: nalidixic acid, ciprofloxacin, norfloxacin
2. Etoposide and doxorubicin are used in cancer treatment. They are inhibitors of topoisomerase II.
3. DNA intercalators (molecules that insert into DNA) prevent unwinding of DNA and are also used in cancer
treatment. E.g., doxorubicin
4. Nucleotide analogs that inhibit deoxyribonucleotide polymerization are generally anticancer or antiviral
agents.
5. In somatic cells, the level of telomerase is turned down. Stem cells retain full telomerase activity.
Cancer cells have continued presence of telomerase, which is a potential target of new anticancer drugs.

Telomere replication
Telomeres are the repetitive sequences (TTAGGG in humans) at the ends of chromosomal DNA.
During replication, DNA synthesis at the chromosome end is restricted because there is no place to produce
the RNA primer needed to start the last Okazaki fragment of the lagging strand. As a result, the new strand
becomes shorter at its 5’ end with each round of replication, and genes would eventually be lost.
The enzyme telomerase is a reverse transcriptase with an internal RNA template. After telomerase
extends the 3’ end of the parental DNA strand, replication of the lagging strand at the chromosome end can
be completed by the conventional DNA polymerase.
The telomerase level is turned down in somatic cells. Stem cells retain full telomerase activity. Cancer
cells have continued presence of telomerase, so chromosome length equilibrium is maintained, leading to
continued cell division. Telomerase is a potential target for newer anticancer drugs.
Telomere length and aging in humans
• In culture, human cells divide for only 20-70 cell generations before senescence and death occur. A
correlation is observed between telomere length and the number of cell divisions, and also aging.
• In progerias, inherited diseases characterized by premature aging, somatic cells have short telomeres
and exhibit decreased proliferative capacity in culture.
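
The link between telomere shortening and a limited number of divisions can be sketched as a toy simulation. The starting length, loss per division and critical length below are hypothetical numbers chosen only for illustration (the notes give none), and the model assumes a fixed loss at every division.

```python
# Toy model with HYPOTHETICAL numbers: a cell stops dividing once the next
# division would take its telomeres below a critical length.


def divisions_until_senescence(
    initial_bp: int,
    loss_per_division_bp: int,
    critical_bp: int,
    telomerase_active: bool = False,
    max_divisions: int = 1000,
) -> int:
    """Count divisions before telomere length would fall below critical_bp.

    With telomerase active, the loss is assumed to be fully restored, so the
    count simply reaches max_divisions (the cap stands in for "no limit").
    """
    length = initial_bp
    divisions = 0
    while divisions < max_divisions:
        next_length = length if telomerase_active else length - loss_per_division_bp
        if next_length < critical_bp:
            break
        length = next_length
        divisions += 1
    return divisions


if __name__ == "__main__":
    # Hypothetical values: 10,000 bp start, 100 bp lost per division, 5,000 bp critical.
    print("somatic cell:", divisions_until_senescence(10_000, 100, 5_000))
    print("telomerase active:", divisions_until_senescence(10_000, 100, 5_000, telomerase_active=True))
```

With shorter starting telomeres, as described for progerias, the same model gives fewer divisions before senescence.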

Reverse transcription
Reverse transcription is the RNA-directed synthesis of DNA, catalyzed by reverse transcriptase. The genetic
material of some viruses is RNA. Retroviruses, e.g., HIV, contain a virally encoded reverse transcriptase.
The enzyme first synthesizes double-stranded DNA from its RNA template. In many cases, the resulting dsDNA
is integrated into the host genome, and expression of the viral RNA genome and mRNAs follows.
Importance of retrovirus
• Integration of the dsDNA copy of a retrovirus into the chromosome of the infected cell can transform the
cell into a cancerous cell.
• The retrovirus HIV causes acquired immunodeficiency syndrome (AIDS).
• Retroviruses can be used in gene therapy.
Clinical Importance
1. Several important antiviral drugs are nucleotide analogs. They inhibit reverse transcriptase activity.
E.g., azido-2’,3’-dideoxythymidine (AZT), 2’,3’-dideoxycytidine (ddC) and dideoxyinosine (ddI) in AIDS
2. Reverse transcriptase can be used to make dsDNA copies from various RNAs in genetic engineering.

Differences in DNA replication between prokaryotes and eukaryotes

Stage        Feature                          Prokaryotes                         Eukaryotes
DNAP         DNA polymerases                  I – replication & mismatch repair   5 DNAPs
                                              II – repair of damaged DNA          α, δ – nuclear DNA replication (α – primase)
                                              III – chain elongation              β, ε – nuclear DNA repair
                                                                                  γ – mt DNA replication
Initiation   Chromatin structure              Nil                                 (+)
             Ori                              Single                              ORC (multiple)
             Ori recognition                  dnaA protein                        Unknown
             Unwinding of DNA double helix    dnaB                                Helicase
             Removal of positive supercoils   DNA gyrase                          DNA topoisomerase I and II
             Synthesis of primer              Primase                             Primase - DNAP α
Elongation   Leading strand synthesis         DNAP III                            DNAP δ
             Lagging strand synthesis         DNAP III                            DNAP α
Termination  Replacement of RNA with DNA      DNAP I                              DNAP β, ε
             (proof reading)
             Telomere synthesis               Nil                                 Telomerase
             Joining of Okazaki fragments     DNA ligase                          DNA ligase
             Chromatin structure              Nil                                 Reconstitution

Genomic stability

A typical mammalian cell accumulates many thousands of lesions during a 24-hour period. As a result
of DNA repair, fewer than 1 in 1000 becomes a mutation. Genomic stability is important for the health of
the individual and for maintenance of the species.
DNA is more stable than RNA but has limited chemical stability. Spontaneous damage of DNA is the
major factor in mutagenesis and ageing.
Causes of DNA damage
1. Spontaneous damage or errors during replication, a major factor in mutation and ageing,
e.g., hydrolysis, oxidation, non-enzymatic methylation
2. Physical agents such as UV rays and radiation
3. Chemical agents: dyes, drugs, heavy metals, petroleum products
o Aflatoxin, produced by mold in peanuts, undergoes epoxidation by Cyt P450 and causes base alteration.
o Benzopyrene (cigarette smoke) causes base pair alteration.
4. Biological agents: viral infections and fungi
o Base analogs and virus infection change the DNA sequence.

Types of damage
1. Single-base alterations due to
• Depurination – purine N-glycosidic bonds are especially labile.
• Deamination of cytosine to uracil, adenine to hypoxanthine, guanine to xanthine
• Insertion or deletion of a single nucleotide
• Alkylation
• Base analog incorporation
2. Two-base alterations due to
• UV light-induced thymine-thymine dimers
• Alkylating agent cross-linkage
3. Chain breaks, which may be
• Single-stranded breaks caused by ionizing radiation, radioactive substances, free radicals
• Double-stranded breaks caused by ionizing radiation, some chemotherapeutic agents
4. Cross-linkage
• between bases in the same or opposite strands
• between DNA and protein molecules

Mechanisms of DNA repair


1. Mismatch repair mechanism
It corrects replication errors in which 1-5 bases are mispaired or unpaired. A specific protein of the
mismatch repair system scans for methylated adenine within a GATC sequence in the template strand and
identifies the mismatched base in the newly synthesized, unmethylated strand. The strand bearing the
mutation is cut by GATC endonuclease, and an exonuclease digests through the mutation, thus removing the
faulty DNA. The gap is filled by DNA polymerase and the strands are joined by DNA ligase.
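
The detection step of mismatch repair can be mimicked on paper with a short Python sketch: given the template strand and the newly made strand, it reports positions where the base pairing rule is violated and where GATC sites lie on the template. It only checks complementarity; methylation, the nuclease cut and resynthesis are not modelled, and the function names and sequences are hypothetical.

```python
# Minimal sketch: locate mispaired positions and GATC sites on a template strand.
# Both strands are written 5' to 3'; the new strand is antiparallel to the template.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template: str, new_strand: str) -> list[int]:
    """Return 0-based template positions whose partner base breaks the pairing rule."""
    if len(template) != len(new_strand):
        raise ValueError("strands must have the same length")
    partner = new_strand[::-1]  # read 3' to 5' so it lines up under the template
    return [i for i, (t, n) in enumerate(zip(template, partner)) if COMPLEMENT[t] != n]


def gatc_sites(strand: str) -> list[int]:
    """Return 0-based start positions of GATC sequences on a strand."""
    return [i for i in range(len(strand) - 3) if strand[i:i + 4] == "GATC"]


if __name__ == "__main__":
    template = "GATCAAGT"
    correct_copy = "ACTTGATC"  # exact reverse complement of the template
    faulty_copy = "ACGTGATC"   # one wrong base opposite template position 5
    print(find_mismatches(template, correct_copy))
    print(find_mismatches(template, faulty_copy))
    print(gatc_sites(template))
```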

2. Base excision repair mechanism


Depurination is a spontaneous process that occurs at a rate of about 10,000 per cell per day. In
addition, cytosine, adenine and guanine bases spontaneously deaminate to form uracil, hypoxanthine and
xanthine, respectively. These are not normal bases of DNA.
A specific DNA glycosylase cleaves the N-glycosidic bond between the damaged base and deoxyribose, and
the base is removed. The phosphodiester bond 5’ or 3’ to the apurinic or apyrimidinic (AP) site is cleaved
by an AP endonuclease, and a phosphodiesterase excises the remaining phosphodiester bond. The gap is filled
by DNA polymerase and sealed by DNA ligase.

3. Nucleotide excision repair mechanism (NER)


The damaged DNA is removed as an oligonucleotide fragment by a special excinuclease. This involves
hydrolysis of 2 phosphodiester bonds on the strand containing the defect: the enzyme cuts the strand above
and below the defective region. The gap is filled by DNA polymerase and the strand is sealed by DNA ligase.

4. Strand break repair mechanism


Single-strand breaks are frequently induced by ionizing radiation. They are repaired by direct ligation
or by an excision repair mechanism that uses the undamaged strand as template.
Double-strand breaks are more dangerous because they can lead to chromosome breakage and
rearrangements. Double-strand breaks are produced by ionizing radiation and some chemotherapeutic agents.
Non-homologous end joining (NHEJ) is the major pathway and involves Ku protein and DNA-dependent
protein kinase. Ku protein, which has ATP-dependent helicase activity, binds to the DNA ends. The DNA-Ku
complex then recruits DNA-dependent protein kinase, which becomes activated. The active kinase
phosphorylates Ku on both ends and then dissociates from the DNA. This activates Ku, which unwinds the two
DNA ends. The approximated DNA ends form base pairs, and extra nucleotide tails are removed by an
exonuclease. The gaps are filled by DNA polymerase and joined by DNA ligase.
Homologous recombination (HR) repairs double-strand breaks without introducing mutations but
requires the presence of a homologous chromosome.

Biomedical importance
Defects in the DNA repair systems increase the frequency of mutation and can cause cancer.
1. Skin fibroblasts from patients with xeroderma pigmentosum have a defect in the excinuclease of NER.
2. Hereditary nonpolyposis colorectal cancer (Lynch syndrome) results from mutation of a gene encoding a
protein involved in the mismatch repair system.
3. Ataxia-telangiectasia is due to defective double-strand break repair.

Repair mechanism      Enzymes
Mismatch              GATC endonuclease
                      Exonuclease
Base excision         DNA glycosylases
                      AP endonuclease
                      Phosphodiesterase
Nucleotide excision   Excinuclease
Double strand break   Ku protein
                      DNA-PK

Functional Eukaryotic Gene

• A gene is defined as a segment of DNA (or in a few cases RNA) that encodes the information
required to produce a functional biological product: a protein or one of several classes of RNA molecules.
• Genes are located on the chromosomes. The site of a gene on a chromosome is called its locus. An
alternative form of a gene is called an allele. The direction of a gene is 5’ to 3’.
• The main function of a gene is to express the genetic information that it carries. Gene
expression includes transcription (formation of RNA from DNA) and translation (formation of a protein
from RNA). Gene expression is tissue- and time-specific in nature, e.g., insulin, embryonic and fetal
hemoglobin.
• According to the gene expression pattern, there are two types of genes. An inducible gene responds to
regulatory signals, whereas a constitutive gene is expressed at a constant rate and does not respond to
regulation. The protein products of constitutive genes are always necessary for cellular metabolism, and
these genes are called housekeeping genes.
A functional eukaryotic gene contains two large regions.
1. Structural region
It contains alternating (a) exons and (b) introns. Exons are coding segments that contain the
information for protein synthesis. Introns are non-coding segments that are important for the structure and
regulation of the gene. They are removed from the precursor RNA before it is transported into the cytoplasm.
2. Regulatory region, which consists of two classes of elements.
(a) Basal expression region
It is essential for gene expression under basal conditions. Its elements are usually located at the 5’ end
and are cis-acting because they lie close to the gene.
• Proximal component or TATA box, located about 25 to 30 bp upstream of the transcription
start site (-25 to -30). It binds and directs RNA polymerase to the starting point of transcription. In
mammals, the exact sequence of the TATA box is slightly different and is known as the Goldberg-Hogness box.
• Upstream element or CAAT box, located at about -70 to -80 bp; it specifies the frequency of
initiation.
(b) Regulated expression region, which is located in a variety of places. It regulates the rate of gene
expression. These regions respond to various signals such as hormones, metals and chemicals.
• Enhancer elements increase the rate of gene expression. Silencer elements decrease the rate of gene
expression.
• Other regulatory elements, e.g., the hormone response element (HRE)
• Genes do not function autonomously. Protein-DNA interaction at the regulatory region regulates gene
expression, e.g., through transcription factors. DNA-binding proteins have 4 structural motifs, namely
helix-turn-helix, helix-loop-helix, zinc finger and leucine zipper.
• Cis-acting elements – DNA sequences in the vicinity of the structural portion of a gene that
are required for gene expression.
• Trans-acting factors – factors, usually considered to be proteins, that bind to the cis-acting
sequences to control gene expression.

Gene Expression

Gene expression is the process by which genetic information, carried as the DNA sequence of an individual
gene, is converted into an individual polypeptide or protein. It can be divided into two major parts:
transcription and translation.

DNA Transcription (RNA Biosynthesis)

Transcription is a process by which the information contained in DNA (a gene) is copied by base pairing, to
form a complementary sequence of ribonucleotides, the RNA chain.
Template strand or anti-sense strand – the strand (3’ to 5’ direction) that is transcribed into an RNA molecule.
Coding strand or sense strand – the other strand (5’ to 3’ direction)
Characteristics of transcription
1. The general steps are similar in eukaryotes and prokaryotes, except that transcription takes place in the nucleus in
eukaryotes and in the cytoplasm in prokaryotes. Nearly all eukaryotic mRNA precursors are processed.
2. In eukaryotes, it occurs in transcriptionally active euchromatin of the chromosome.
3. RNA is transcribed from the template DNA strand (read in the 3’ to 5’ direction). Polymerization of the RNA strand
takes place in the 5' to 3' direction.
4. The RNA has the same sequence as the coding strand, except that U replaces T.
5. The base pairing rule is always maintained.
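
The characteristics above can be illustrated with a minimal Python sketch. The sequences and function names here are hypothetical, chosen only for illustration; the sketch simply applies the base pairing rule to a template strand written 3’ to 5’ and checks that the result matches the coding strand with U in place of T.

```python
# Base pairing used when RNA is built on a DNA template (A in DNA pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

def coding_to_rna(coding_5_to_3):
    """The RNA has the same sequence as the coding strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")

coding = "ATGGCCTAA"     # made-up coding (sense) strand, written 5'->3'
template = "TACCGGATT"   # its complementary template strand, written 3'->5'
rna = transcribe(template)
print(rna, rna == coding_to_rna(coding))
```

Both routes give the same RNA sequence, which is why the coding strand is the one usually quoted when a gene sequence is written down.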

Transcription in Eukaryotes (RNA Biosynthesis)

In eukaryotes, transcription takes place in the nucleus. It requires template DNA, RNA polymerase,
ribonucleoside triphosphates, and many transcription factor proteins and co-activator proteins. Splicing occurs
in nearly all RNA precursors.
Eukaryotes have 3 types of RNA polymerase. They differ in template specificity, localization and
susceptibility to inhibitors.
RNA polymerase I, located in the nucleoli, transcribes the genes for 18S, 5.8S and 28S rRNA.
RNA polymerase II, located in the nucleoplasm, synthesizes mRNA and snRNA.
RNA polymerase III, located in the nucleoplasm, synthesizes 5S rRNA and tRNA.

In eukaryotes, positioning RNA polymerase at the transcription start site for initiation of transcription
requires the interaction of transcription factors, trans-activator proteins and co-regulator proteins with the
enhancers and other cis-acting sequence elements.
In initiation of transcription, binding of the TATA box binding protein (TBP) with TBP-associated
factors (TAFs) forms TFIID. Binding of the TFIID complex to the TATA box is the 1st step in the transcription process.
Other proteins associated with transcription initiation are TFIIA, B, E, F, H and RNA polymerase II.
TATA box located at about -25bp from the transcription start site directs the RNA polymerase to the
transcription start site. Additional element CAAT box specifies the rate of transcription, and some contain a
GC box. Transcription is further stimulated by enhancer elements, located in either upstream or downstream
of the gene.
RNA chains are synthesized in the 5’ to 3’ direction. Four different nucleoside triphosphates,
ATP, GTP, CTP and UTP, are used as substrates. Information is transferred according to the
complementary base pairing rule. The 1st and 2nd nucleotides attach at the initiation site and RNA polymerase
catalyzes the formation of the 1st phosphodiester bond.
After formation of 1st phosphodiester bond, elongation starts at transcription bubbles – the region
containing RNA polymerase, DNA and nascent RNA that moves along the DNA template. RNA Polymerase
can perform de novo synthesis of RNA chain and primer formation is not required in transcription. It does not
have nuclease activity and cannot do proof reading.
The signals for termination of transcription by eukaryotic RNA polymerase II are poorly understood.
Formation of phosphodiester bonds ceases when the termination signal is reached. The RNA-DNA hybrid
dissociates and the melted region of the DNA strands rewinds. RNA polymerase is then released from the template.
RNA processing of primary transcript occurs in the nucleus and forms mature RNA. The process
includes removal of extra nucleotides, base modification, addition of nucleotides and separation of different
RNA sequences by the action of specific nucleases.

Formation of mature messenger RNA (mRNA)


The primary transcript or heterogenous nuclear RNA (hnRNA) or pre-mRNA is a faithful copy of a
gene. It is modified by capping, tailing and splicing.
A cap, a 7-methyl guanosine triphosphate residue, is added to the 5’ end of hnRNA. The transcript is
cleaved about 20 nucleotides downstream from an AAUAAA recognition sequence, and a poly A tail containing
about 200 A nucleotides is added at the 3’ end of the mRNA.
RNA splicing involves removal of introns and ligation of the remaining exons. Five snRNAs (U1, U2, U4,
U5 and U6) are complexed with protein subunits to form snRNPs (small nuclear ribonucleoproteins). These snRNPs
are the core of the spliceosome. The consensus sequences for RNA splicing are the dinucleotides GU at the 5’ end
and AG at the 3’ end of the intron sequence. The junction between the 5’ exon and the intron is cut. The free 5’ end
of the intron forms a loop or lariat structure by linking to the branch site near the 3’ splice site. A second cut is
made at the junction of the intron with the 3’ exon, and the lariat structure containing the intron is released and
hydrolyzed. The 5’ and 3’ exons are ligated to form a continuous sequence.
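
The outcome of splicing (not its enzymatic mechanism) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence is made up, the intron coordinates are supplied by hand, and the function name `splice` is hypothetical. The only rule taken from the text is that an intron begins with GU and ends with AG.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) half-open index pairs, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Consensus check from the text: introns begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])   # exon lying before this intron
        position = end
    exons.append(pre_mrna[position:])            # last exon
    return "".join(exons)

# Made-up pre-mRNA: exon 1 + intron (GU...AG) + exon 2
pre_mrna = "AUGGCC" + "GUAAGUCCUAACAG" + "UUUUAA"
mature = splice(pre_mrna, [(6, 20)])
print(mature)
```

In the cell the spliceosome locates the boundaries itself; here they are passed in explicitly to keep the sketch short.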

Formation of transfer RNA


Both prokaryotic and eukaryotic tRNAs are transcribed as larger precursors. The primary transcript is
cleaved by ribonuclease P on the 5’ side of the 1st nucleotide. Ribonuclease D trims the exposed 3’ end. Then
the sequence CCA is added to the 3’ terminus by tRNA nucleotidyl transferase. Further modifications include
nucleotide alkylation, methylation and thiolation, resulting in the formation of mature tRNA.

Formation of ribosomal RNA (rRNA)


A single large precursor 45S RNA transcript is cleaved to 28S rRNA, 18S rRNA and 5.8S rRNA. The
5S rRNA is transcribed separately by RNA polymerase III. Assembly of 28S rRNA, 5.8S rRNA and 5S rRNA with
ribosomal proteins forms the larger 60S subunit. The association of 18S rRNA with the appropriate ribosomal
proteins forms the 40S subunit.

Transcription in prokaryotes

In prokaryotes, both transcription and translation take place in the cytoplasm. Prokaryotes have a single form of
RNA polymerase (RNAP), although different sigma factors may be involved in the initiation of different genes. In
E. coli, RNAP is a multi-subunit enzyme (α2ββ’σ).
Subunit Role
α binds the regulatory sequences
β forms phosphodiester bonds
β’ binds the DNA template
σ recognizes promoter and initiates transcription

Identification of the transcription start site is important to obtain the desired mRNA. Many prokaryotic
promoters have 2 conserved regions, located about 10 nucleotides and 35 nucleotides upstream (-10 and
-35 bp) of the transcription start site. In prokaryotes, transcription factors are not needed; the sigma subunit
of RNA polymerase recognizes the promoter sites. The promoters thus recruit RNA polymerase to the
transcription start site.
RNA Polymerase can perform de novo synthesis of RNA chain and primer formation is not required
in transcription. It does not have nuclease activity and cannot do proof reading.
RNA polymerase has an intrinsic unwindase activity that opens the DNA helix. A purine nucleotide is
usually the 1st to be polymerized into the RNA molecule. RNA chains are synthesized in the 5’ to 3’ direction.
Four different types of nucleoside triphosphate: ATP, GTP, CTP and UTP are used as substrates. The
mechanism of information transfer is according to complementary base pairing rule.
After formation of 1st phosphodiester bond, elongation starts at transcription bubbles – the region
containing RNA polymerase, DNA and nascent RNA that moves along the DNA template. The superhelical
tension in DNA due to unwinding is controlled by the activity of topoisomerase I and II.
Termination of synthesis of RNA molecule is signaled by a sequence in the template strand of DNA
molecule. Termination occurs by Rho dependent or independent termination.
Rho dependent termination requires rho protein, an ATP-dependent helicase. It unwinds the DNA-RNA
duplex, causing dissociation of RNA polymerase from the template and stopping transcription.
The transcribed region of the DNA template contains stop signals. In rho independent termination, a
hairpin loop followed by several U residues forms in the RNA and leads to termination of transcription. The
DNA-RNA hybrid is unstable in this region because A:U pairs are the least stable base pairs. The nascent RNA
dissociates from the DNA template and then from the enzyme.

Biomedical importance of transcription


• Rifampicin inhibits β subunit of prokaryote RNA Pol and impairs 1st phosphodiester bond formation.
• Aminoglycosides block the self-splicing mechanism of RNA and interfere with mature RNA synthesis in
prokaryotes.
• Actinomycin D binds tightly and specifically to double helical DNA in both prokaryotes and eukaryotes and
prevents it from being an effective template for RNA synthesis.
• Alpha amanitin, a toxin from Amanita phalloides, inhibits eukaryotic RNA Pol II and decreases the amount of
mRNA and thus protein synthesis.

Human genome

The total DNA content of a cell is the genome. The genetic information is stored in the base sequence of
DNA. It is specific for each species and almost the same for all members of the species, but unique for each
individual. Cellular DNA contains genes and intergenic regions; both serve important functions in the cell.
The human genome contains 20,000 to 25,000 different protein coding genes spread over 23 pairs of
chromosomes. Only about 2% of the human genome codes for proteins and functional RNA.
Most of the DNA does not carry protein-coding information and was once called junk DNA. Much of the
mammalian genome is redundant. However, these regions regulate the expression of genes during development,
differentiation and adaptation to the environment.

Repeated DNA sequence


There are 3 types of specialized nucleotide sequences, called the centromere, the telomere and the replication
origin. The telomere is essential for chromosome stability and cell division. It is the end of the chromosome
and consists of short TG rich repeats. Human telomeres have a variable number of 5'-TTAGGG-3' repeats.
The centromere holds the sister chromatids together and is the attachment site for the spindle during mitosis.
The replication origin is essential for initiation of replication.
At least 30% of the genome consists of repetitive sequences. [The two major classes are middle repetitive and
highly repetitive sequences. Some middle repetitive sequences are genes for tRNA, rRNA and histones;
other repetitive sequences have no known function and may participate in DNA strand association and
chromosomal rearrangements during meiosis (e.g., Alu sequences, about 300 bp long, with 300,000 to 500,000
copies in the human genome).] Some repetitive sequences thus include the genes that specify transfer and
ribosomal ribonucleic acids.
Transposons (mobile elements) are DNA elements that have contributed to evolution.
Microsatellite sequences consist of 2-6 bp units repeated up to 50 times, at 50,000 to 100,000 sites in the genome.
The repeat number may differ between the two chromosomes of a pair, giving heterozygosity in an individual. It is
a heritable trait. Trinucleotide repeat instability with an increased repeat number occurs in fragile X syndrome.

Structure & Functions of RNA


RNA is a single-stranded polymer of ribonucleotides joined by 3’ to 5’ phosphodiester bonds. It can fold
back on itself to form hairpin structures. A ribonucleotide contains ribose sugar, phosphate and a purine (adenine
or guanine) or pyrimidine (cytosine or uracil) base. In an RNA molecule, the numbers of purine and pyrimidine
bases are not equal.
They are found in nucleus, cytoplasm and mitochondria. They are important in production of proteins in living
organisms. RNA forms the genetic material in some viruses.
Different types
1. Messenger RNA (mRNA)
2. Transfer RNA (tRNA)
3. Ribosomal RNA (rRNA)
4. small nuclear RNA (snRNA) (in eukaryotes)
5. small cytoplasmic RNA (scRNA) (in eukaryotes)
mRNA (5% of total RNA, 0.5-6+kb)
Structure
It is oriented in the 5’ to 3’ direction. The 5' end is capped by 7-methyl guanosine triphosphate to prevent
attack by 5' exonuclease. A poly A tail (20 to 250 adenylate residues) is attached to the 3’ end to prevent
attack by 3' exonuclease. At the 5’ and 3’ ends, there are base paired loops in regions known as untranslated
regions. They play an essential role in regulation of gene expression.
Function
• It serves as a messenger, conveying the genetic information from the nucleus to the protein synthesizing
machinery.
• It also serves as a template for polymerization of amino acids into protein.

tRNA (15% of total RNA, 65-110 nucleotides)


Structure
There are at least 20 species of tRNA molecules, corresponding to each of the 20 amino acids. tRNA has a
clover leaf shape (secondary structure) and an L shape (tertiary structure). The two limbs are not equal; the longer
limb ends with the nucleotide sequence CCA. It contains 5 arms.
a. The acceptor arm is the point of attachment for the carboxyl group of the amino acid.
b. The anticodon arm is responsible for the specificity of the tRNA.
c. The D arm provides the recognition site for aminoacyl tRNA synthetase.
d. T Ψ C arm is involved in binding of aminoacyl tRNA to the ribosomal surface.
e. The extra arm provides a basis for classification.
Function
• tRNA carries activated specific amino acid to the site of protein synthesis.
• It serves as adaptor molecule between codon of mRNA and specific amino acid.
rRNA (80% of total RNA)
Structure – It is associated with proteins in ribosomes. Sizes are expressed in Svedberg units (S).
The mammalian ribosome contains 40S and 60S subunits. The 60S subunit contains 5S rRNA, 5.8S rRNA and
28S rRNA. The 40S subunit contains 18S rRNA. In prokaryotes, the 70S ribosome contains 30S (16S rRNA) and 50S
(23S & 5S rRNA) subunits combined with proteins.
Function
• It serves as site for protein synthesis.
• 28S rRNA of 60S subunit contains peptidyl transferase activity and is a ribozyme.

snRNA (in eukaryotes)


They are significantly involved in mRNA and rRNA processing and gene regulation.

Small non-coding RNAs such as miRNA and siRNA typically inhibit gene expression by hybridizing with
targeted mRNA.

Differences between RNA & DNA

Feature: DNA compared with RNA
Strand
  DNA: Double helix
  RNA: Single strand (can fold back on itself into hairpins)
Polymer
  DNA: Polymer of deoxyribonucleotides linked by 3', 5' phosphodiester bonds
  RNA: Polymer of ribonucleotides linked by 3', 5' phosphodiester bonds
Nucleotide
  DNA: Deoxyribose, purine (adenine, guanine), pyrimidine (thymine, cytosine) and phosphate
  RNA: Ribose, purine (adenine, guanine), pyrimidine (uracil, cytosine) and phosphate
Purine & pyrimidine content
  DNA: Equal, because the 2 strands are held together by complementary base pairing (A with T and G with C)
  RNA: Not equal, but a single strand of RNA can fold on itself like a hairpin (A with U and G with C)
Hydrolysis by alkali
  DNA: Cannot be hydrolyzed to 2’, 3’ cyclic diesters of mononucleotides, due to absence of the 2’OH group
  RNA: Hydrolyzed to 2’, 3’ cyclic diesters of mononucleotides
Functions
  DNA: Template for replication and transcription
  RNA: mRNA, tRNA and rRNA are involved in protein synthesis

Central dogma of molecular biology or flow of genetic information


Replication: DNA → DNA (for new generation)
Transcription: DNA → RNA
Translation: RNA → Protein
Reverse transcription: RNA → DNA

An organism must be able to:

- store and preserve its genetic information
- pass that information along to future generations (replication)
- express that information to carry out all the processes of life

Genetic information in most living organisms is stored in the base sequence of DNA (in retroviruses, it is found in
RNA). DNA is packed into structures called chromosomes.
• Gene expression definition - Transcription and Translation
• Transcription definition
• Translation definition
• Replication definition – it occurs before cell division
• Reverse transcription – The genetic information stored in retrovirus is copied to DNA for new viral
generation

Response element
A response element is a nucleotide sequence that allows a gene to respond to specific stimuli. Response
elements are often part of promoters or enhancers. A single gene may possess a number of different response
elements. Multiple genes may possess the same response element and so respond in the same way.
Transcriptional factors
They are proteins that recognize promoters, enhancers and response elements. Many transcription
factors act positively and promote transcription, while others act negatively and promote gene silencing.

Mitochondrial DNA (codons that differ from the standard genetic code)
AUG & AUA = Methionine
UGA = Tryptophan
AGA & AGG = Termination codons

Genetic Code

Nirenberg was awarded the Nobel Prize in 1968 for deciphering the genetic code. The letters A, G, T
and C correspond to the nucleotides found in DNA. Within the protein coding genes, these nucleotides are
organized into three-letter code words called codons. The collection of these codons makes up the genetic
code.
When the four nucleotides are combined three at a time, 64 (4 x 4 x 4) possible codons result. Three
codons, UAA, UAG and UGA, do not specify any amino acid and act as stop codons or non-sense codons. The
codon AUG, which codes for methionine, acts as the start codon for protein synthesis.
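
The count of 64 can be checked by enumerating every triplet. The short Python sketch below is only an illustration; the variable names are arbitrary.

```python
from itertools import product

BASES = "UCAG"                           # the four RNA bases
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

STOP_CODONS = {"UAA", "UAG", "UGA"}      # non-sense codons named in the text
START_CODON = "AUG"                      # codes for methionine

# Codons that specify an amino acid (everything except the three stop codons)
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]
print(len(codons), len(sense_codons))
```

Removing the three stop codons leaves 61 codons for 20 amino acids, which is why most amino acids have more than one codon.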

Salient features of Genetic Code


1. Triplet codons
A codon is a consecutive sequence of three bases on mRNA, e.g., UUU codes for phenylalanine.
2. Degenerate (redundancy)
Many amino acids are designated by more than one codon. For example, codons for serine include UCC,
UCA, UCU & UCG. The 1st two bases are limited to one or two combinations but the 3rd is flexible (wobble).
This minimizes the deleterious effects of mutation.
3. Unambiguous
Each codon specifies only one amino acid, without any doubtful meaning.
4. Non-overlapping
The codons are consecutive. After commencing at the start codon, the codons are read as a continuous
sequence of triplet nucleotides up to the stop codon.
5. No punctuation (commaless)
The codons are not separated from each other by non-coding nucleotides.
6. Universal
It means that all prokaryotes & eukaryotes use nearly the same codons to specify each amino acid. But
there are minor differences in some bacteria and mitochondria. The genetic code has been highly preserved
during evolution.
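
Several of these features can be seen in a minimal Python sketch. It assumes the standard codon table, written compactly in the conventional U, C, A, G order with one-letter amino acid codes and `*` for stop, and it overrides four codons with the mitochondrial meanings listed earlier in these notes. The function name `translate` and the test sequences are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Mitochondrial differences given in these notes (universality has minor exceptions).
MITO_TABLE = {**CODON_TABLE, "AUA": "M", "UGA": "W", "AGA": "*", "AGG": "*"}

def translate(mrna, table=CODON_TABLE):
    """Read non-overlapping triplets from the first AUG up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):   # no commas: codons follow one another
        amino_acid = table[mrna[i:i + 3]]      # unambiguous: one codon, one meaning
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Degeneracy: every codon that specifies serine
serine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "S")
print(serine_codons)
print(translate("GGAUGUUUUCCUAAGG"))
print(translate("AUGUGAAUAAGA"), translate("AUGUGAAUAAGA", MITO_TABLE))
```

The same message gives different products under the two tables because UGA is a stop codon in the standard code but tryptophan in mitochondria.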

Biomedical importance
The understanding of genetic code provides the foundation for explanation of protein biosynthesis,
mutation and diagnosis and treatment of genetic diseases.

Translation (Protein synthesis)

Translation is a complex process by which the information that has been transcribed from DNA to
mRNA directs the ordered polymerization of specific amino acids for the synthesis of proteins. It occurs in the
cytoplasm. Ribosomes serve as the sites of protein synthesis.
mRNA is translated from the 5’ end to the 3’ end. The protein is synthesized from the amino terminal to the
carboxy terminal. Translation is generally divided into four steps: formation of aminoacyl tRNA,
initiation, elongation and termination.

1. Formation of aminoacyl tRNA


The amino acid is 1st activated and transferred to the acceptor arm of its specific tRNA, catalyzed by aminoacyl
tRNA synthetase. ATP is hydrolyzed to AMP, using 2 high energy bonds.
Amino acid + tRNA + ATP → aminoacyl tRNA + AMP + PPi

2. Initiation
Initiation involves several protein-RNA interaction complexes in the ribosome. It involves tRNA, rRNA,
mRNA and at least 10 eukaryotic initiation factors (eIFs). It can be divided into 4 steps.
a. dissociation of the ribosome into 40S and 60 S subunits
b. binding of a ternary complex (the initiator methionyl tRNA, GTP and eIF-2) with 40S ribosome to form
43S pre-initiation complex. In prokaryotes, initiator tRNA carries formyl-methionine.
c. binding of mRNA to 43S pre-initiation complex to form 48S initiation complex
d. then it combines with 60S ribosomal subunit to form 80S initiation complex (80 S/Met-tRNA/mRNA)
At the end of initiation, the ribosome has three sites: the aminoacyl tRNA binding site (A-site), the peptidyl tRNA
binding site (P-site) and the exit site (E-site). Met-tRNA is bound to the P-site and the A-site is free.

3. Elongation
Elongation involves several steps catalyzed by elongation factors (EFs).
a. Binding of aminoacyl-tRNA to the A site, assisted by an elongation factor (eEF-1α) and GTP
b. Peptide bond formation between the amino acids occupying the A site and P site, catalyzed by peptidyl
transferase (a ribozyme)
c. eEF-2 (translocase) moves the ribosome along the mRNA in the 5' to 3' direction, using hydrolysis of GTP
d. Thus, the tRNA-peptide chain moves to the P site. The uncharged (free) tRNA originally in the P site moves to
the E site.
e. The whole process is repeated for addition of the next amino acid. For each peptide bond formed, 4 high
energy phosphate bonds are used.

4. Termination
After multiple cycles of elongation, polymerization of amino acids to form protein is terminated when a
stop codon appears at the A site. Normally there is no tRNA with an anticodon capable of recognizing a
termination codon. Instead, release factors (eRFs) recognize the stop codons (UAA, UAG & UGA) on mRNA. The
release factors, together with peptidyl transferase, cause hydrolysis of the bond between the peptide and the tRNA
occupying the P site. The newly synthesized protein, ribosomal subunits, tRNA and mRNA then dissociate from
each other.
5. Post-translational modification of protein
Most proteins require post-translational modification to become biologically active form.
• Proteolytic cleavage for conversion of preproprotein or proprotein to active protein (preproinsulin)
• Enzymatic glycosylation – carbohydrates are attached to serine or threonine residues, e.g., in hormone
receptors, Ig
• Hydroxylation of amino acid residues, e.g., lysine to hydroxylysine in collagen
• Gamma carboxylation, e.g., glutamic acid residues in prothrombin, with vitamin K as cofactor
• Covalent modification; acetylation, phosphorylation, methylation, ubiquitylation

The energy requirement for peptide bond formation


a. Charging of tRNA with aminoacyl moiety: ATP → AMP + 2Pi (2 high energy bonds)
b. Entry of aminoacyl tRNA into A site: GTP → GDP + Pi (1 high energy bond)
c. Translocation of newly formed peptidyl tRNA from A site to P site: GTP → GDP + Pi (1 high energy bond)
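
The arithmetic behind the figure of 4 high energy phosphate bonds per peptide bond can be written out as a short Python sketch. It is a simplification that counts only the three steps listed above and ignores the cost of initiation and termination; the names used are illustrative.

```python
# High energy phosphate bonds used at each step, as listed in the notes.
HIGH_ENERGY_BONDS = {
    "charging of tRNA (ATP -> AMP + 2 Pi)": 2,
    "entry of aminoacyl tRNA into A site (GTP)": 1,
    "translocation from A site to P site (GTP)": 1,
}

def bonds_per_peptide_bond():
    """Sum the high energy bonds spent for one peptide bond."""
    return sum(HIGH_ENERGY_BONDS.values())

def elongation_cost(peptide_bonds):
    """Approximate total cost for a given number of peptide bonds (elongation only)."""
    if peptide_bonds < 0:
        raise ValueError("number of peptide bonds cannot be negative")
    return bonds_per_peptide_bond() * peptide_bonds

print(bonds_per_peptide_bond())   # 4
print(elongation_cost(99))        # a 100-residue chain has 99 peptide bonds
```

On this count a chain of 100 amino acids, which has 99 peptide bonds, costs roughly 396 high energy bonds.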

Biomedical importance of Protein Synthesis


1. Many inhibitors of prokaryotic protein synthesis are antibiotics.
2. Diseases associated with inhibition of eukaryotic protein synthesis
Diphtheria toxin catalyzes the ADP-ribosylation of eEF-2 and inhibits mammalian protein synthesis.
Polio virus and other picornaviruses can synthesize their own proteins but inhibit host protein synthesis
by disrupting the function of the eIF4F complex.
[eIF4F is a combination of eIF4E, eIF4G, eIF4A. 4E is responsible for recognition of mRNA cap
structure. Its activity is inhibited by binding of inhibitor protein 4EBP1 preventing the formation of
eIF4F.]
3. Protein synthesis is affected by many factors.
The machinery of protein synthesis can respond to environmental threats. Insulin and many growth
factors stimulate eIF4F-cap mRNA complex formation by phosphorylating 4EBP1 and thus enhance protein
synthesis.

Differences between Prokaryotic and Eukaryotic gene expression


1. Different structure of transcriptional unit
o In prokaryotes, the transcriptional unit (structural region) generally contains multiple protein coding regions.
On transcription, the resultant mRNA is polycistronic and is translated into several proteins. In some viruses
(e.g., HBV, polio virus and HAV), a long polyprotein is produced, which is cleaved at specific sites into several
proteins for viral function.
o In eukaryotes, the structural gene contains a single protein coding region and the resultant mRNA is
monocistronic, which is translated into only one protein.
2. Compartmentalization of transcription and translation
o In prokaryotes, both transcription and translation occur in the cytoplasm. In eukaryotes, transcription
occurs in the nucleus and translation in the cytoplasm.
3. Modification of mRNA
o Prokaryotic mRNA are not capped or tailed at 5’ and 3’ ends. But eukaryotic mRNA have 5’ cap and
3’ tail and introns are spliced out.

Protein Folding
To become functionally active, a newly synthesized protein must be folded (non-covalently) with the help of
chaperones (a group of specialized proteins). It is an ATP dependent mechanism.
Polysome
Many ribosomes can translate the same mRNA molecule simultaneously. Multiple ribosomes on the same
mRNA molecule form a polysome or polyribosome.
Mitochondria have their own protein synthesizing machinery.

Protein Targeting
Signal Hypothesis in synthesis of Export Protein
Proteins destined for cytoplasm or nucleus are translated primarily on free polyribosomes but those
for membrane and for secretion into extracellular space are translated on polyribosomes of rough endoplasmic
reticulum.
In translation of secretory or membrane proteins, shortly after the signal sequence is synthesized, it is
recognized by a signal recognition particle (SRP). The SRP-signal peptide-ribosome complex binds to the SRP
receptor on the ER membrane. Then SRP is released, the ribosome binds to the translocon (protein conducting
channel) and the signal peptide inserts into the translocon. The growing polypeptide chain is then fully
translocated across the membrane as protein synthesis continues. The signal peptide is cleaved by signal peptidase
and is degraded. The peptide is released into the lumen of the ER after completion of protein synthesis. Protein
folding and modification then occur in the ER, and further modifications occur in the Golgi apparatus. The protein
is then distributed to the membrane or secreted extracellularly.

Mis-folded protein and diseases


Neurodegeneration – Huntington disease and Alzheimer’s disease
Prion disease (prion = proteinaceous infectious particle)
A misfolded protein with exposed hydrophobic patches tends to aggregate with other, normal protein. The
aggregates persist, grow and are highly resistant to proteolysis. Accumulation of these proteins causes severe
cellular damage and death. The disease can spread from one organism to another.
• Scrapie in sheep
• Creutzfeldt-Jakob disease (CJD) in humans
• Bovine spongiform encephalopathy (BSE) in cattle

Process         Prokaryote        Eukaryote
Replication     DNAP I, II, III   DNAP α, β, δ, γ, ε
Transcription   RNAP              RNAP I, II, III
Initiation      σ (sigma)         Transcription factors
Termination     ρ (rho)           ρ (rho) independent

Protein synthesis     Prokaryotic                                           Eukaryotic
mRNA                  Polycistronic; no cap and tail                        Monocistronic; both cap and tail
Initiation            30S subunit binds to Shine-Dalgarno sequence on mRNA  40S subunit associates with 5’ cap on mRNA
Initiator tRNA        N-fMet-tRNA                                           Met-tRNA
Initiator codon       AUG or GUG                                            AUG
Initiation factor     IF 1, 2, 3                                            eIF 1, 2, 3
Peptidyl transferase  50S subunit                                           60S subunit
Elongation factor     EF Tu, Ts                                             eEF 1α
Translocation         EF G                                                  eEF 2
Termination           RF 1, 2, 3                                            RF

Regulation of gene expression

The genetic content of all somatic cells of an organism is the same, but not all genes are expressed in all
tissues; expression is tissue specific in nature (tissue specific gene expression). Moreover, organisms can alter
gene expression in response to a variety of changes and stimuli. Not all genes are expressed all the time.
Gene expression, including tissue specific gene expression, is influenced by genetic developmental
programs, hormones, growth factors, heavy metals, metabolic state, environmental challenges and
diseases. Dysregulation of gene expression can lead to human disease. Thus, molecular understanding of these
processes can lead to the development of therapeutic agents.
2 Types of genes
1. Constitutive gene or house-keeping gene e.g. enzymes of glycolysis
2. Inducible gene – gene expression is induced or repressed according to the need of the metabolism

2 types of gene regulation


1. Positive regulation
2. Negative regulation
Regulation of gene expression can occur at following levels.
a. transcription (major control site)
b. post-transcriptional
c. translation and
d. post-translational

Operon systems in E. coli


1. Lac operon
2. Arabinose operon
3. Tryptophan operon
4. Histidine operon
5.

Regulation of gene expression in Prokaryotes

The major site of control of gene expression in prokaryotes is at the transcription level. The operon
model was described by Jacob and Monod in 1961. The negative & positive control of gene expression can
be explained with the E. coli lac operon. Other explanations, based on the tryptophan and arabinose operons, are
also available.
In prokaryotes, the genes involved in a metabolic pathway are often present in a collected group
called an operon. The operon is composed of structural genes, control elements, a regulator/inhibitor gene,
an operator and a promoter area. The cluster of genes in an operon can be regulated by a single promoter or
regulatory region.
The structural region contains 3 structural genes present as a continuous segment. Adjacent to the
promoter is the lacI gene, which encodes the repressor protein and is expressed constitutively. The protein
products of the structural genes of the lac operon are involved in the metabolism of lactose.
✓ Z gene codes for β-galactosidase, which acts on lactose to produce glucose and galactose
✓ Y gene codes for lactose permease, which actively transports lactose (and other galactosides) into the cell
✓ A gene codes for transacetylase
When E. coli is grown in a medium containing glucose, the lac genes are repressed, since utilization of
glucose is preferred. The lac genes are de-repressed only after glucose has been depleted from the medium;
the bacterium then utilizes lactose to supply glucose as a usable energy source.

Normally, lac genes are repressed.


In the absence of lactose, the repressor protein binds to the operator site. It thus interferes with the
recognition and binding of RNA polymerase at the promoter site, or prevents the movement of the attached
RNAP. Thus, the genes cannot be transcribed. [The repressor protein also has four allosteric sites for binding 4
molecules of the lac inducer (lactose).]
When glucose is depleted from the medium, the organism temporarily stops growing until the genes
of the lac operon are induced to provide proteins that metabolize lactose to supply glucose as a usable
energy source.
De-repression of lac operon
It requires both inducer molecule and positive control element. Gene expression only occurs in the
presence of both. Inducer (lactose) can enter even in the absence of permease and bind with repressor protein
and reduces its binding affinity to operator site.
Induction
With the depletion of glucose, adenylate cyclase is activated and the cAMP level increases. cAMP
binds to the catabolite activator protein (CAP, also called cAMP activator protein). The cAMP-CAP complex binds
to its response element (CRE) in the promoter. This allows efficient binding of RNAP to the promoter. Transcription
of the lac operon then begins, and the polycistronic mRNA is translated into the corresponding proteins:
β-galactosidase, permease and transacetylase.
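
The combined negative and positive control described above can be summarized as a small truth table. The Python sketch below is a deliberate simplification that treats each signal as simply present or absent; the function name and the returned labels are illustrative only.

```python
def lac_operon_state(glucose_present, lactose_present):
    """Qualitative state of the lac operon under the two-signal model in the notes."""
    # Negative control: without inducer (lactose), the repressor stays on the operator.
    repressor_on_operator = not lactose_present
    # Positive control: cAMP rises, and cAMP-CAP binds, only when glucose is depleted.
    camp_cap_bound = not glucose_present
    if repressor_on_operator:
        return "repressed"
    if camp_cap_bound:
        return "induced"
    return "not induced"   # inducer present, but no cAMP-CAP to help RNAP bind

for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose}, lactose={lactose}: {lac_operon_state(glucose, lactose)}")
```

Only the combination of no glucose and lactose present gives full induction, which matches the statement that expression requires both the inducer and the positive control element.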

Regulation of gene expression in Eukaryotes

Eukaryotes differ in 3 ways from prokaryotes


1. Binding of regulatory protein to enhancer or silencer
2. Require general transcriptional factors
3. Package of eukaryotic DNA into chromatin

I. Transcriptional regulation (major)


[Chromatin remodeling, DNA binding proteins, Hormonal control]
1. Gene amplification
2. Gene rearrangement
II. Post transcriptional regulation
Alternate splicing of mRNA, Editing of RNA, Transport mRNA
III. Translational regulation
1. mRNA stability
[Regulating mechanism during translation, RNAi]
IV. Post translational Regulation
1. Protein modification
2. Protein compartmentalization
3. Protein stabilization

I. Transcriptional control
Control of gene expression in eukaryotes is primarily at the level of transcription.
1. Histone modification
It regulates the chromatin structure and the accessibility of DNA to gene regulatory proteins. Reversible
acetylation of lysine residues of core histones by acetylase weakens the histone-DNA interaction
and relaxes the nucleosome. This facilitates binding of other regulatory proteins and RNA polymerase to
specific elements of DNA so that transcription can commence. Conversely, the removal of acetyl groups by
deacetylase promotes the condensation of chromosomes and inhibits transcription.
[Methylation of the base cytosine is generally associated with inactivation of gene expression. Demethylation of
the promoter or of a coding sequence of the gene is required for efficient gene expression.]

2. Certain DNA elements enhance or repress transcription


In some genes, certain DNA elements such as enhancers, silencers and response elements regulate
transcription. They interact with regulatory proteins (such as transcription factors and trans-activating factors) and
intracellular hormone receptors, causing transcription or gene silencing.
The gene regulatory proteins have specific domains and bind with high affinity and specificity to the
correct region of DNA. Four common classes of DNA binding domains are
(a) Helix-turn-helix – simplest and most common
(b) Zinc finger - present in steroid hormone receptors. Mutation in the vitamin D receptor results in
vitamin D resistance and rickets.
(c) Leucine zipper - present in CREB (cAMP response element binding protein)
(d) Helix-loop-helix
Proteins containing a leucine zipper motif or a helix-loop-helix motif bind DNA effectively only when they
form dimers. Protein kinases are essential for their dimerization.

3. Gene amplification
Under certain conditions, single copy genes are amplified many fold during development or in
response to drugs. Resistance of cancer cells to anti-cancer drugs can be due to gene amplification, e.g.,
methotrexate increases the number of genes for dihydrofolate reductase.

II. Post transcriptional regulation


1. Alternate splicing of mRNA
A cell can splice the hnRNA in different ways, so different polypeptide chains can be produced
from a single pre-mRNA. The proteins may differ by only a few amino acids, may have major differences or
may have different biological roles.
[This regulation is cell or tissue specific, or is seen at certain stages of development or under certain
conditions. For example, the product of the same gene is calcitonin in parafollicular C cells and calcitonin
gene-related peptide (CGRP), a protein involved in taste sensation, in brain cells.]
2. Editing of RNA
It is the enzyme-mediated alteration of the nucleotide sequence of an mRNA before translation. The
substitution of one nucleotide for another can result in tissue specific differences in the transcript. For
example, editing of CAA (glutamine) to UAA (termination codon) in the apolipoprotein B mRNA produces
Apo B48 (2152 amino acids, 48%) in enterocytes instead of Apo B100 (4536 amino acids) in liver.
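
The effect of a CAA to UAA edit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mRNA fragment is a short hypothetical sequence (not the real apolipoprotein B mRNA), the codon table is deliberately partial, and the function names are invented for this example.

```python
# Partial codon table covering only the codons used in this toy example.
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "CAA": "Gln", "UAA": "Stop",
    "GAU": "Asp", "UGG": "Trp",
}

def translate(mrna):
    """Translate an mRNA string codon by codon until a stop codon is met."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

def edit_c_to_u(mrna, position):
    """Return a copy of the mRNA with the C at `position` (0-based) changed to U."""
    if mrna[position] != "C":
        raise ValueError("expected a C at the edited position")
    return mrna[:position] + "U" + mrna[position + 1:]

# Hypothetical fragment: Met-Ala-Gln-Asp-Trp followed by a stop codon.
unedited = "AUGGCUCAAGAUUGGUAA"
edited = edit_c_to_u(unedited, 6)   # CAA -> UAA in the third codon

full_length = translate(unedited)   # unedited transcript (as in liver)
truncated = translate(edited)       # edited transcript (as in enterocytes)
```

The edited transcript stops at the new UAA codon, so the product is a shortened version of the same protein, which is the relationship between Apo B48 and Apo B100.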

3. Transport of mRNA
Mature mRNA bound to proteins is transported to the cytoplasm through nuclear membrane pores. The 3'
UTR (untranslated region) of mRNA is important for mRNA localization in the cytoplasm.

III. Translational regulation


1. mRNA stability
Eukaryotic mRNA is more stable than prokaryotic mRNA. Sequences at the 3’ end of an mRNA
appear to determine its half-life. UTR binding proteins also affect gene expression by controlling the
stability of mRNA.
[E.g., the iron response element (IRE) located at the 3’ end of transferrin receptor mRNA determines its
stability. When the iron level is low, the IRE-binding protein (IRE-BP) binds to the IRE and prevents degradation
of the mRNA. When the iron level is high, iron binding to IRE-BP lowers its affinity for the mRNA and the mRNA
is degraded.]

2. RNA interference (RNAi) or post-transcriptional gene silencing (PTGS)


It is controlled by very small non-coding RNAs, about 20-30 nucleotides long, known as micro RNA
(miRNA) and small interfering RNA (siRNA). siRNAs are part of an enzyme complex that targets and cleaves
mRNAs with high specificity and down-regulates expression of the selected mRNA. miRNAs have imperfect
recognition and therefore act upon a larger number of target mRNAs.
[dsRNA is diced by an ATP-dependent ribonuclease (Dicer) into short interfering RNAs (siRNAs). siRNAs are
transferred to a second enzyme complex, designated RISC for RNA-induced silencing complex. The siRNA
guides RISC to the target mRNA, leading to its destruction. Attenuation or repression of translation occurs by
binding to the 3’ UTR.]
RNAi is used as a form of primitive immunity to protect the genome from invasion by exogenous
nucleic acids and mobile genetic elements, such as viruses and transposons. Anti-sense gene therapy uses
anti-sense oligonucleotides (anti-sense DNAs):
o As an anti-cancer agent in the treatment of acute myelogenous leukemia
o As an anti-viral agent in the treatment of AIDS.

3. Translation regulation
eIF2 and eIF4 are the focus of this regulatory mechanism. The activity of these proteins can be controlled
by phosphorylation. Starvation and hormones control these mechanisms. Some viruses inhibit host protein
synthesis.
[E.g., in reticulocytes, globin chain synthesis is regulated at the level of translation. eIF2 is inactive when
phosphorylated by a kinase. Heme binds to the kinase and inactivates it, preventing phosphorylation of eIF2. Thus
an elevated level of heme favors translation of globin chains.]

IV. Post translational modification


Proteins are degraded proteolytically, generally by the ubiquitin-dependent proteasome and
lysosomal systems. The PEST (proline, glutamate, serine and threonine) sequence marks some proteins for rapid
turnover. Proteins with N-terminal arginine generally have a shorter half-life than proteins with
N-terminal methionine.

Epigenetics
The literal meaning is on top of or in addition to genetics. These regulatory mechanisms do not
change the regulated DNA sequence but change the expression pattern of this DNA.
Mechanisms underlying epigenetics
Every cell in the organism carries an identical genome. However, the terminal phenotype within an
organism is not fixed, and deviation is caused by changes in gene expression in response to environmental cues.
DNA methylation, histone modification and RNA-associated silencing are the major mechanisms of epigenetic
control.

                Prokaryote         Eukaryote                De novo synthesis   Proofreading
Replication     DNAP I, II, III    DNAP α, β, δ, γ, ε       DNAP (-)            (+)
Transcription   RNAP               RNAP I, II, III          RNAP (+)            (-)
Initiation      σ (sigma)          Transcription factors
Termination     ρ (rho)            ρ-independent

Genomic instability

• Genomic instability is due to mutation of DNA.


a. Vertical transmission: if the mutation occurs in germ cells, inherited diseases will develop, e.g., HbS,
thalassaemia
b. Horizontal transmission: if the mutation occurs in somatic cells, it causes defects in cell growth,
differentiation and metabolism, and cancer.

Genetic disorder

A genetic disorder is a disease caused by abnormalities in an individual’s genetic material (genome).


Four Types of genetic disorders

1. Single gene or Mendelian or monogenic disorder


Changes or mutations occur in the DNA sequence of one gene, and the protein produced lacks its normal
function. The disorder can be autosomal dominant, autosomal recessive or X-linked. E.g., cystic fibrosis,
sickle cell anemia, G6PD deficiency
2. Multifactorial disorder (complex or polygenic)
It is caused by a combination of environmental factors and mutations in multiple genes. Some of the most
common chronic diseases are multifactorial disorders. E.g., heart disease, high BP, Alzheimer’s disease,
arthritis, diabetes, cancer and obesity
3. Chromosomal disorder
A change in the number or structure of chromosomes causes the disease, e.g. missing or extra
copies, gross breaks and rejoining. Trisomy 21 (Down syndrome) is due to dysfunctional meiosis.
Philadelphia chromosome: a reciprocal translocation between the long arms of chromosomes 9 and 22 causes
chronic myeloid leukemia. As a result, the BCR (breakpoint cluster region) gene of chromosome 22 fuses with the
ABL gene (which encodes a tyrosine kinase) of chromosome 9, directing synthesis of a chimeric protein with
unregulated tyrosine kinase activity that drives cell proliferation. Tyrosine kinase inhibitors, e.g. imatinib
(Gleevec), are used to treat CML.
4. Mitochondrial disorder
Although mitochondrial disorders are rare, the mutation rate of mitochondrial DNA is about 10 times that of
nuclear DNA. Mt-DNA is a small circular molecule which is contributed by the ovum only. Mutated mitochondria
are transmitted by maternal, non-Mendelian inheritance.

Gene Mutation

Any permanent change in the nucleotide sequence of a gene is called a gene mutation. A single nucleotide
or more than one nucleotide of the gene can change. The mutation may be heritable or non-heritable. It may
affect the pattern of gene expression or the function of proteins.

I. Point mutation or single base change or nucleotide substitution


A single nucleotide (base) of a gene is changed or altered. A transition is the replacement of a
base by another base of the same group (purine for purine, or pyrimidine for pyrimidine). A transversion is the
replacement of a base by a base of the other group (purine for pyrimidine, or vice versa). If a gene containing a
point mutation is transcribed, the mRNA will contain the changed base in a codon.
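
The distinction between the two kinds of substitution can be written as a small check. The sketch below is illustrative only; the function name `classify_substitution` is an assumption of this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

def classify_substitution(original, mutated):
    """Label a single-base substitution as a transition or a transversion."""
    bases = {original, mutated}
    if original == mutated or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, G, C, T, U")
    # Same base group (both purines or both pyrimidines) means transition.
    same_group = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_group else "transversion"

a_to_g = classify_substitution("A", "G")   # purine to purine
a_to_t = classify_substitution("A", "T")   # purine to pyrimidine
```

For instance, the GAG to GTG change in the coding strand that underlies HbS replaces A with T, so it is a transversion.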
Single base change in mRNA may have one of several effects when translated into protein.

Effects of point mutation


a. No detectable effect (silent mutation)
The mutation occurs in the 3rd nucleotide of the codon, and the amino acid sequence of the protein is
unchanged because of the degeneracy of the genetic code. The protein will perform its normal function. E.g., a
change from GUU to GUA does not alter the amino acid, as both codons code for valine.
b. Missense mutation
A base change in the 1st or 2nd position of a codon causes incorporation of a different amino acid in the
resulting protein at the corresponding site. The function of the protein may be acceptable, partially acceptable
or unacceptable depending on the location of the changed amino acid in the protein.
(1) Acceptable mutation – the resulting protein retains its normal function,
e.g., Hb Hikari, in which asparagine replaces lysine at position 61 of the β globin chain but the hemoglobin
performs its normal function. Isoenzymes produced by such mutations have the same catalytic activity with
different properties.
(2) Partially acceptable mutation – the protein molecule has some normal as well as some abnormal function.
E.g. HbS, caused by substitution of valine (non-polar) for glutamate (acidic) at position 6 of the β globin chain
(GAA to GUA, or GAG to GUG). Although HbS carries oxygen, it becomes less soluble and polymerizes in the
deoxygenated state, causing sickle-shaped RBCs.
(3) Unacceptable mutation – the resulting protein molecule cannot perform its function, e.g. HbM (Boston), in
which histidine at position 58 of the α chain is replaced by tyrosine and which cannot transport O2.
c. Nonsense mutation
A change in the nucleotide sequence of a codon results in formation of a nonsense (termination) codon.
Protein synthesis stops prematurely and the protein fragments cannot perform their functions, e.g.,
thalassemia, Hb McKees Rocks
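
The three outcomes of a point mutation (silent, missense, nonsense) can be told apart by comparing what the codon encodes before and after the change. The following is a minimal sketch using the standard genetic code; the function name `classify_point_mutation` is an assumption of this example, and it handles only codons that differ at exactly one base.

```python
from itertools import product

# Standard genetic code, built in UCAG order; "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_point_mutation(codon_before, codon_after):
    """Classify a single-base change within one mRNA codon."""
    if len(codon_before) != 3 or len(codon_after) != 3:
        raise ValueError("codons must be three bases long")
    differences = sum(a != b for a, b in zip(codon_before, codon_after))
    if differences != 1:
        raise ValueError("expected codons differing at exactly one base")
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    if aa_after == aa_before:
        return "silent"
    if aa_after == "*":
        return "nonsense"
    return "missense"

valine_change = classify_point_mutation("GUU", "GUA")   # both code for valine
hbs_change = classify_point_mutation("GAG", "GUG")      # glutamate to valine
stop_change = classify_point_mutation("CAA", "UAA")     # glutamine to stop
```

Whether a missense change is acceptable, partially acceptable or unacceptable cannot be read from the codon alone; it depends on where the amino acid sits in the protein, as described above.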
II. Frame shift mutation
Insertion or deletion of one or more nucleotides (in a number not divisible by three) in the coding strand of a
gene results in an altered reading frame of the mRNA. Translation is garbled beyond the point of mutation and
generates a protein with an altered amino acid pattern, i.e. addition or deletion of amino acids in the given
protein. The function of the protein is totally changed. E.g., Hb Cranston has 157 amino acids instead of 146 in
the beta globin chain; some forms of thalassemia
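
A short sketch shows why losing one base garbles everything downstream while losing three bases does not. The mRNA below is a hypothetical 18-nucleotide sequence chosen for illustration, and the helper `translate` is an assumption of this example.

```python
from itertools import product

# Standard genetic code, built in UCAG order; "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first base until a stop codon or the end of the mRNA."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein

normal = "AUGGCUGAAUUUCCGUAA"            # AUG GCU GAA UUU CCG UAA (hypothetical)
one_deleted = normal[:4] + normal[5:]    # single-base deletion: frame shifts
three_deleted = normal[:3] + normal[6:]  # one whole codon deleted: frame kept

normal_protein = translate(normal)
frameshift_protein = translate(one_deleted)
in_frame_protein = translate(three_deleted)
```

In the frameshifted sequence the original stop codon is no longer read in frame, so translation runs on past it. A comparable read-through is one way a frameshift can lengthen a chain, as with the 157 residues of Hb Cranston.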
III. Trinucleotide repeat mutation
It is characterized by amplification of a sequence of three nucleotides, which differs in various
disorders, although all affected sequences share the nucleotides guanine and cytosine. E.g., in Fragile X
syndrome, a CGG repeat is seen on the X chromosome.
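
Counting how many times a trinucleotide is repeated in tandem is a simple pattern search. The sketch below is an illustration on a made-up sequence; `longest_repeat_run` is a hypothetical helper name, and no diagnostic thresholds are implied.

```python
import re

def longest_repeat_run(sequence, unit="CGG"):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Made-up sequence: a run of five CGG units, then a separate run of two.
sample = "AT" + "CGG" * 5 + "TA" + "CGG" * 2
repeat_count = longest_repeat_run(sample)
```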
[Splice-site mutation - Any change at an intron-exon junction may cause synthesis of an abnormally
spliced product or failure to produce the protein.
Promoter mutation - A mutation can occur in a promoter or other regulatory site. The structure and function of
the protein are intact but the rate of protein synthesis changes.]

Stem cell

Stem cells are unspecialized cells found in most multicellular organisms.
Two important characteristics
1. Self- renewal- Can replenish their number for long periods through cell division
2. Potency- After receiving certain chemical signals, can differentiate or transform into specialized cells

Classification of stem cells depending on differentiation potential or potency of the cell


1. Totipotent stem cells
▪ Produced by the 1st few divisions of the fertilized egg (blastocyst or morula cells)
▪ Can differentiate into embryonic and extra-embryonic cell types (placenta)
2. Pluripotent stem cells
▪ Descendants of totipotent cells
▪ Can differentiate into cells of the 3 germ layers
3. Multipotent stem cells
▪ Can differentiate into a limited number of types
▪ e.g., hematopoietic stem cells into RBC, WBC, platelets
▪ Neural stem cells into nerve cells and glia cells
4. Unipotent stem cells (progenitor cells)
▪ Can produce only one cell type
▪ e.g. , erythroid progenitor cells into only red blood cells

The common stem cells used in research


1. Embryonic stem cells (totipotent and pluripotent stem cells) – from in vitro fertilization
2. Adult stem cells (multipotent and unipotent stem cells)
Uses of stem cells
• Scientists believe that the introduction of healthy stem cells into a patient may restore good
function to a damaged organ.
• Adult stem cells are used in the treatment of leukemia and other bone marrow and blood cancers through
bone marrow transplant
• Trials on the treatment of cancer, Parkinson’s disease and spinal cord injuries

Ethics in Stem Cell Research (ESCR)


• A key ethical concept is the moral status of the embryo. The right of a fetus at any particular stage is
balanced against the potentially large benefits that others may gain from research and ultimately, stem
cell-based treatments.
• Use of the pre-14-day embryo, which is still little more than a ball of cells, remains justified.

[Production of induced pluripotent stem cells (iPS) from somatic cells
It is a scientific breakthrough in stem cell research, which led to the award of the Nobel Prize in Physiology
or Medicine in 2012. Some proteins produced from proliferative genes can transform mature somatic cells
into pluripotent stem cells. They are Oct4, Klf4, Sox2 and c-Myc. Candidate somatic cells for
reprogramming into iPS are neuronal progenitor cells, keratinocytes, hepatocytes, fibroblasts, B
lymphocytes and gastric epithelial cells. The benefit is that stem cells can be obtained easily and stem cell
therapy becomes more accessible. The drawback is the development of abnormal cell growth after stem cell
therapy.]

Cellular Growth control and Cancer

Control of cell growth is achieved by the complex, finely tuned action of many proteins that regulate cell
proliferation. They are divided into 4 groups
a. Growth factors
b. Growth factor receptors
c. Intracellular signal transducers including nuclear receptor, Cell cycle control proteins
d. DNA repair proteins

Normal cell growth and differentiation are carried out by the coordinated action of proliferating genes
(proto-oncogenes) and tumor suppressor genes (differentiation and growth inhibition). Proto-oncogenes encode
various proteins that are involved in normal growth and division of cells e.g., growth factors, Ras, MAPK,
cyclins, DNA binding proteins. Tumor suppressor genes encode proteins that normally suppress cell growth
e.g., retinoblastoma protein, p53. With a loss of function mutation, tumor suppressor genes cannot inhibit
abnormal cell growth. A gain of function mutation of a proto-oncogene enhances cell proliferation.

Growth factor              Sis
Growth factor receptor     Erb
G protein                  Ras
Tyrosine kinase            Src
DNA binding protein        myc

Derangement in the controls on a cell’s proliferation, differentiation and survival causes cancer. A single
mutation is not sufficient to convert a healthy cell into a tumor cell. Several mutations have to occur together.

Characteristics of cancer cells


1. Proliferate rapidly
2. Display diminished growth control
3. Display loss of contact inhibition in vitro
4. Invade local tissues and spread, or metastasize to other parts of the body
5. Are self-sufficient in growth signals and are insensitive to anti-growth signals
6. Stimulate local angiogenesis
7. Are often able to evade apoptosis

Oncogenes

Oncogenes are mutated proliferating genes derived from proto-oncogenes. They were first recognized in
viruses. Their products, oncoproteins, cause aberrant gene regulation, with gain of function or inappropriate
regulation of normal cell growth, leading to cellular transformation.

Proto-oncogenes are activated to oncogenes by several ways:


1. Mutation e.g., a point mutation in the RAS oncogene results in a product with reduced GTPase activity, so
that stimulation of the activity of adenylyl cyclase persists.
2. Promoter insertion e.g., insertion of a viral promoter region in a proliferating gene activates the gene.
3. Enhancer insertion e.g., insertion of a viral enhancer region in a proliferating gene activates the gene.
4. Chromosomal translocation e.g., Burkitt lymphoma, Philadelphia chromosome
[In Burkitt lymphoma, the MYC gene, contained in a small piece of chromosome 8, is transferred to chromosome 14.]
5. Gene amplification e.g., abnormal replication of a gene results in many copies of oncogenes and of genes
involved in tumor drug resistance

Mechanism of oncogene
Products of oncogenes (oncoproteins) stimulate growth and division of cells by three general mechanisms:
1. May imitate the action of a growth factor
2. May imitate an occupied receptor for a growth factor
3. May act at key intracellular points involved in growth control e.g., src acting as a tyrosine kinase, myc
acting as a DNA binding protein

Tumor markers

Many cancers are associated with the abnormal production of molecules: enzymes, hormones or proteins.
They are known as tumor markers and can be measured in plasma or serum. Tumor marker is a biological
substance which can be produced directly by the tumor or non- tumor cells as a response to the presence of
tumor.

Points to be noted about tumor markers


1. No single marker is useful for all types of cancer or for all patients with a given type of cancer
2. Markers are more often detected in advanced stages of cancer than in early stages, when they
would be more helpful.
3. Markers are also elevated in the blood of patients with non-cancerous disease e.g., CEA in GI disorders,
PSA in prostatitis and BPH

Markers                                       Associated cancer
Carcinoembryonic antigen (CEA)                Colon, lung, breast, pancreas, stomach
Alpha fetoprotein (AFP)                       Liver, non-seminomatous testicular germ cell tumor
Human chorionic gonadotropin (hCG)            Trophoblastic tumors, choriocarcinoma
Calcitonin (CT)                               Thyroid (medullary carcinoma)
Prostatic acid phosphatase (PAP)              Prostate
Prostate specific antigen (PSA)               Prostate
CA-125                                        Ovarian cancer
CA-19-9                                       Pancreatic cancer
Placental alkaline phosphatase                Seminoma
S-100                                         Melanoma, neural derived tumors, astrocytoma
Tartrate-resistant acid phosphatase (TRAP)    Hairy cell leukemia
Monoclonal Ig                                 Myeloma

Recombinant DNA Technology or Genetic Engineering

Recombinant (chimeric or hybrid) DNA is DNA altered by the insertion of a sequence of
deoxyribonucleotides, not previously present, into an existing molecule of DNA by enzymatic or chemical
means.

Genetic recombination
Genetic recombination is the process whereby new linkage relationships are established between genes.
1. General recombination – exchange between homologous chromosome in meiosis
2. Site-specific recombination - integration of viral DNA into host genomes

Common techniques used in Genetic analysis


1. A specific gene can be isolated from a chromosome. Large DNA molecules are cleaved into small DNA
fragments by restriction endonucleases, which cleave both strands at specific sequences.
2. The DNA segments formed are separated by gel electrophoresis and the desired segment can be isolated.
3. The desired DNA can be amplified by cloning or PCR. In cloning, the DNA fragment is joined to vector DNA
and the chimeric DNA is then multiplied within bacteria. The polymerase chain reaction is an in vitro method of
amplifying a target DNA within a relatively short time.
4. Nucleic acid hybridization is used to find a specific sequence of DNA or RNA.
5. Direct sequencing of DNA is used to determine the base sequence of the DNA fragment.

Cleavage of the desired DNA fragment by restriction endonuclease


Restriction endonucleases are the tool of choice for DNA fragmentation. They are produced by bacteria as a
defense against DNA viruses. Each enzyme cleaves DNA at a specific palindromic sequence. A palindrome is a
sequence in which both strands of DNA have the same sequence when read in the 5’ to 3’ direction. There are
two types of cut: blunt end or sticky end. The enzymes are named according to the organism from which they
were isolated, e.g., EcoRI
• 1st letter (E): genus name
• 2nd and 3rd letters (co): species name
• 4th letter (R): strain name
• Roman numeral (I): order of discovery

Restriction map is the diagrammatic representation of DNA molecule indicating the site of cleavage by
various restriction enzymes.
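
Both ideas in this section, the palindromic recognition site and the cleavage into fragments, can be sketched in code. The example assumes the EcoRI site GAATTC cut after the first G; the DNA string is hypothetical, only one strand is modelled, and the function names are invented for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """True when both strands read the same in the 5' to 3' direction."""
    return site == reverse_complement(site)

def digest(dna, site="GAATTC", cut_offset=1):
    """Cut one strand of `dna` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for a cut between G and AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

dna = "AAGAATTCCCGGGAATTCTT"   # hypothetical sequence with two GAATTC sites
fragments = digest(dna)
```

Because the cut is offset from the centre of the site, each fragment is left with a single-stranded overhang, which is the sticky end mentioned above. A cut at the centre of the site (`cut_offset=3` for a six-base site) would model a blunt end.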
Gel electrophoresis
It determines the length and purity of DNA molecules. Because DNA carries a negative charge, the fragments
migrate through the gel towards the positive electrode and are separated by size. Polyacrylamide or
agarose gel is used. For visualization of DNA, the radioisotope 32P and the dye ethidium bromide are commonly
used.
Nucleic acid hybridization & blotting
It is a method in which an oligonucleotide probe (DNA or RNA, 100-300 bp long), labeled with a
radioisotope or chemical, hybridizes to the complementary sequence in the sample. It is used to identify the
presence or absence of a particular DNA or RNA, the amount of RNA transcribed and altered DNA.

A probe is a single stranded DNA or RNA that is complementary to the target DNA. To be
detectable, the probe must be labeled with either a radioactive isotope or a fluorescent group.
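
Complementarity between probe and target can be illustrated with a short search. This sketch treats hybridization as an exact match to the reverse complement of the probe, which is a simplification; the sequences and the name `hybridization_sites` are assumptions of this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridization_sites(target, probe):
    """Return 0-based positions on `target` where `probe` can base-pair fully."""
    site = reverse_complement(probe)
    return [
        i for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]

target = "TTGACCTGAATT"                      # hypothetical single-stranded target
probe = reverse_complement("GACCTG")         # probe designed against GACCTG
sites = hybridization_sites(target, probe)
```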
Nucleic acid hybridization
1. Southern blot – Hybridization of a probe to the bound DNA on a nitrocellulose membrane
2. Northern blot - Hybridization of a probe to the bound RNA on a nitrocellulose membrane

Restriction fragments obtained from genomic DNA are separated by gel electrophoresis. Blots are
created by laying a membrane over one face of the gel and then creating a flow which carries the molecules in
the gel onto the nitrocellulose membrane. The nucleic acid fragments transferred to the membrane are then
hybridized with a labeled specific probe. The probe binds to the fragment having complementary bases and the
hybrid area can be visualized by an appropriate method such as exposure to X ray film, UV or densitometry.

Protein hybridization - Western blot


It is a detection method for visualization of polypeptides and proteins. Although the procedure is
similar to nucleic acid hybridization, it is based on the antigen-antibody reaction. Antibodies tagged with a
detector (enzyme or radioisotope) are used as probes.
[Fluorescence in situ hybridization (FISH)
It is a procedure for detection of a deleted or mutant gene in an intact chromosome. The chromosomes
are spread from a metaphase cell; even interphase cells can be used. The chromosomal DNA is denatured and a
fluorescent-tagged probe is applied. It is widely used in clinical practice e.g., detection of the HER2/neu gene
in breast cancer.
Dot blot
It is a rapid, inexpensive screening test for the detection of small mutations and polymorphisms. It
uses synthetic oligonucleotide probes at least 17 or 18 bp long. Two probes are required, one for the normal and
one for the mutant sequence. It can be carried out in any phase of the cell cycle.]

Complementary DNA (cDNA)


It is mostly used as a probe for hybridization techniques and for making gene libraries. A cDNA probe is
made from mRNA by reverse transcriptase.
Library
A collection of the different recombinant clones is called a library.
a. Genomic library is prepared from the total DNA of a cell line or tissue.
b. cDNA library comprises complementary DNA copies of the population of mRNAs in a tissue.
Polymerase Chain Reaction (PCR)
• Developed by Kary Mullis in the mid-1980s
It is an in vitro (test tube) method of amplifying a target sequence of a DNA molecule. The process is
based on replication, but many of its steps are accomplished in different ways.
Millions of DNA copies can be obtained by PCR within a few hours. The sample can be a very small
amount of genomic DNA, such as that in a drop of blood or a hair.
The materials required for PCR are
1) DNA of interest
2) Two oligonucleotide DNA primers (20-25 bp), complementary to the sequences flanking the target DNA.
3) Heat-stable DNA polymerase: Taq polymerase (from Thermus aquaticus)
4) Mixture of the 4 dNTPs: dATP, dGTP, dCTP and dTTP
5) Reaction buffer containing enhancer and magnesium
6) Thermal cycler
The steps involved in PCR are;
1. Denaturation
The mix is heated to 94 - 98 °C for 5 minutes in order to denature the target DNA.
2. Primer annealing
The mix is then cooled to 50 - 65 °C to allow the primers to anneal to the complementary sequences in the
test DNA.
3. Primer extension
The temperature is then raised to the optimal temperature of Taq polymerase (72 °C) for synthesizing
new DNA strands from complementary dNTPs.
This set of three steps is one cycle. The target DNA is replicated in each cycle, so the number of DNA copies
increases exponentially over repeated cycles (as many as 30 to 60). [PCR can be used to
amplify DNA from buccal smears, single hairs, blood spots, body fluid secretion, fetal blood, chorionic villi
and amniotic fluid, paraffin embedded tissue or fossil.]
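
The exponential increase can be made concrete with a small calculation. If every target molecule is copied once per cycle, the number of copies doubles each cycle, so n cycles give 2^n copies per starting molecule. The sketch below is illustrative; the `efficiency` parameter (fraction of templates copied per cycle) is an assumption added for this example and is not a value from these notes.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; efficiency=1.0 means perfect doubling."""
    return initial_copies * (1 + efficiency) ** cycles

# One cycle is the three steps described above (temperatures in degrees Celsius).
CYCLE = [
    ("denaturation", (94, 98)),
    ("primer annealing", (50, 65)),
    ("primer extension", (72, 72)),
]

ideal = pcr_copies(1, 30)             # perfect doubling for 30 cycles
lower_yield = pcr_copies(1, 30, 0.9)  # assumed 90% efficiency per cycle
```

With perfect doubling, a single template gives 2^30 (about one billion) copies after 30 cycles, which is why a very small sample is enough.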

Application of PCR
(1) Detection of infectious agents (bacterial or viral DNA/RNA) in blood or body fluid, especially in the latent
period
(2) To detect allelic polymorphism, mutations and restriction fragment length polymorphism
(3) To make prenatal diagnoses of genetic diseases
(4) It is useful in recombinant DNA technology and DNA sequencing
(5) To study ancient DNA and evolution, using DNA from archeological samples
(6) To establish precise tissue types in organ transplantation
(7) In forensic medicine, it is useful for distinguishing one person from another, using specimens found at the
scene of a crime, by DNA fingerprinting
(8) For RNA analysis after RNA copying (reverse transcription-PCR), and mRNA quantitation by real time
RT-PCR
(9) To detect single nucleotide polymorphisms (SNPs). A polymorphism is defined as any DNA sequence variant
for which the population frequency of the less common allele is more than 1%.
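
The 1% rule in item (9) is a simple frequency test. The following sketch applies it to allele counts from a hypothetical population sample; the function names are invented here, and taking the second most common allele when more than two alleles are present is an assumption of this example.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele (0.0 if only one allele)."""
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    frequencies = sorted(
        (count / total for count in allele_counts.values()), reverse=True
    )
    return frequencies[1]

def is_polymorphism(allele_counts, threshold=0.01):
    """Apply the 'less common allele above 1%' definition."""
    return minor_allele_frequency(allele_counts) > threshold

common_variant = is_polymorphism({"A": 950, "G": 50})    # minor allele at 5%
rare_variant = is_polymorphism({"A": 9990, "G": 10})     # minor allele at 0.1%
```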
Advanced PCR systems for determination of quality and quantity of nucleic acids
• Reverse Transcriptase PCR (RT– PCR)
• Real time PCR (quantitative PCR)

DNA cloning
A clone is a large population of identical molecules or cells that arise from a common ancestor.
Molecular cloning is the propagation and multiplication of selected DNAs in microorganisms.
The DNA fragment of interest is ligated to vector DNA and the chimeric DNA is inserted into host bacteria.
The DNA fragment is cloned as the bacteria grow and divide. Most methods use a cloning vector: a plasmid,
bacteriophage or artificial chromosome. The microorganisms containing the specific DNA sequence are
identified by hybridization. DNA cloning is mostly used for overexpression of proteins.

Cloning vectors
1. Plasmids
Plasmids are small, circular, non-chromosomal duplex DNA molecules in bacterial cells. They often confer
antibiotic resistance on the host cell. Their replication does not depend on chromosomal DNA replication. They
can accept 6 – 10 kb of foreign DNA.
2. Phages
Phages (bacteriophages) are viruses that infect bacteria. They have a linear DNA molecule and can accept
10 – 20 kb of foreign DNA.
3. Cosmids
Cosmids are circular DNA molecules which combine the best features of plasmids and phages. They contain a
plasmid origin of replication (ori), which allows autonomous replication, an antibiotic resistance marker and a
cos site. They can accept 35 – 50 kb of DNA.
4. Bacterial artificial chromosome (BAC) & yeast artificial chromosome (YAC) can accept > 50 kb.
==================================================================
RFLP (restriction fragment length polymorphism)
Restriction endonuclease cleaves dsDNA at a specific sequence, producing a characteristic set of
smaller DNA fragments. If the DNA differs from normal at a cleavage site, fragments of different lengths are
produced; these variations are called RFLPs.
RFLPs are due to a variety of mutations. They are used to facilitate prenatal detection of a number of
hereditary disorders including sickle cell trait and thalassaemia, and also in human identification. They are
inherited in Mendelian fashion.
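
How a single base change produces an RFLP can be sketched by digesting two versions of the same region and comparing fragment lengths. The sequences are hypothetical, the enzyme is assumed to cut the site GAATTC after the first G, and `fragment_lengths` is a helper name invented for this illustration.

```python
def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Lengths of the fragments produced by cutting `dna` at every `site`."""
    lengths = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        lengths.append(pos + cut_offset - start)
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    lengths.append(len(dna) - start)
    return lengths

normal_allele = "AAGAATTCCCGGGAATTCTT"    # two recognition sites
variant_allele = "AAGAATTCCCGGGAGTTCTT"   # one base changed in the second site

normal_pattern = fragment_lengths(normal_allele)
variant_pattern = fragment_lengths(variant_allele)
```

The variant has lost one cleavage site, so two of the normal fragments are replaced by one longer fragment. On a gel this appears as a different band pattern.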
Variable number of Tandem repeat (VNTR)
A VNTR is a location in a genome where a short nucleotide sequence is organized as a tandem repeat. It
can be analyzed by RFLP. VNTRs are found on many chromosomes and show variations in length between
individuals. They are inherited as alleles and are useful for personal or parental identification, in genetic and
biological research, in forensics and DNA fingerprinting, and in the CODIS database.
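
Because VNTR alleles are inherited, a child should carry one allele from each parent at every locus. The sketch below checks this for made-up genotypes, written as pairs of repeat counts; the locus names, the numbers and the function `consistent_with_parents` are all hypothetical, and real casework also has to allow for mutation and laboratory error.

```python
def consistent_with_parents(child, mother, father):
    """True if, at every locus, one child allele can come from each parent."""
    for locus, (a, b) in child.items():
        maternal = set(mother[locus])
        paternal = set(father[locus])
        fits = (a in maternal and b in paternal) or (a in paternal and b in maternal)
        if not fits:
            return False
    return True

# Genotypes as (allele 1, allele 2) repeat counts at two hypothetical loci.
child = {"locusA": (12, 15), "locusB": (7, 9)}
mother = {"locusA": (12, 14), "locusB": (9, 9)}
alleged_father = {"locusA": (15, 16), "locusB": (7, 8)}
unrelated_man = {"locusA": (13, 16), "locusB": (7, 8)}

match = consistent_with_parents(child, mother, alleged_father)
mismatch = consistent_with_parents(child, mother, unrelated_man)
```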

Application of rDNA Technology or Biomedical importance of genetic engineering

1. Molecular analysis of the diseases


➢ Analysis of genetic variation in normal genes or genes causing diseases
➢ Analysis of point mutation, frame shift mutation and rearrangement of DNA e.g. sickle cell anaemia,
thalassaemia
➢ Useful in pedigree analysis and prenatal diagnosis of genetic diseases
➢ RFLP and VNTR as DNA finger printing for identity matching or inheritance matching in forensic
medicine
FBI has standardized a set of 13 VNTR assays for DNA typing and organized the CODIS
database (Combined DNA Index system)
2. Gene mapping
➢ Localization of a gene can be defined by a map of human genome. In situ hybridization is used to
localize the gene on chromosome.
➢ Chromosomal walking is used to find the diseased gene.
3. Production of proteins for research and diagnosis
➢ For Production of vaccine e.g., hepatitis B vaccine
➢ For treatment of disease e.g. interferon, tPA
➢ To provide human proteins e.g. insulin and growth hormone
➢ For diagnostic test e.g. AIDS test, HbS antigen
4. Study of gene expression of tumor tissue by DNA microarrays
- cDNAs from normal tissue are tagged with a green fluorescent label and those from tumor tissue with a red
fluorescent label. cDNAs from these sources are applied to the microarray of cDNAs under study.
5. Gene therapy
➢ To replace a non-functional or abnormal gene, or to repair an abnormal gene. Constructed normal
genes are carried principally by viruses e.g., for adenosine deaminase deficiency
6. Creation of trans-genic animals
Transgenic animals are used experimentally to analyze gene expression or the effect of overproduction of gene
products.
[Genes are injected into the fertilized ovum of an experimental animal (mouse). In a certain percentage of
cases, the inserted gene will be incorporated into the genome of the animal.]
Pharmacogenomics
It is the study of how an individual’s genetic inheritance affects the body’s response to drugs. It is
a combination of pharmaceutical science and biochemistry. It can create personalized or tailor-made drugs
with greater efficacy and safety. A person’s response to a drug is influenced by environment, diet, age,
lifestyle, state of health and, especially, genetic make-up.

Gene Therapy

Gene therapy is a technique for correcting defective genes responsible for disease development.
It is insertion or alteration or removal of genes within an individual’s cell.
Steps of gene therapy
- Isolate the healthy gene along with its control sequence
- Incorporate this gene into a gene vector (retrovirus, adenovirus, adeno-associated virus and Herpes
simplex virus)
- Finally deliver the vector to the target cell.

Types of gene therapy


(1) Germ line gene therapy - Target cells are cells from fertilized ovum or eggs or sperms
(2) Somatic gene therapy
Therapeutically used genes are transferred into the somatic cells of a patient. Monogenic diseases are
obvious candidates for this approach (e.g. adenosine deaminase deficiency). The first procedure (1992), in Italy,
used hemopoietic stem cells as vector. The first successful gene therapy (2002) was the treatment of adenosine
deaminase deficiency causing severe combined immunodeficiency disease (SCID). Not only genetic diseases but
also non-communicable diseases (NCD) such as arthritis, atherosclerosis and cancer are candidates for gene
therapy.
RNA can also be used as a tool in gene therapy: antisense (the sequence complementary with mRNA)
technology and RNAi.
[Anti-sense oligonucleotides – miRNAs and siRNAs typically inhibit gene expression at the level of translation
by targeting mRNA. The hybrids formed between mRNA and anti-sense RNA or anti-sense DNA
inhibit translation and promote RNA degradation.]
Ethics in Gene therapy
- Germ line gene therapy is prohibited for application in human beings

Human Genome Project (HGP)


HGP was launched in 1990 and completed in 2003. 18 countries participated in
this project.
The objectives of the International Human Genome Project included;
• Construction of a genetic map
• Identification of all human genes
• Sequencing of the entire genome
• Storage of the information in databases
• Improvement of tools for data analysis, transfer of technology to the private sector, and addressing the
ethical, legal and social issues arising from the project

Important Knowledge from HGP


• 99.9% nucleotide - same in all people.
• Over 50% of genes - unknown function ("junk DNA")
• The estimated number of genes is 20,000 to 25,000 (20,500 genes)
• Chromosome 1 has the most genes : 2968 and Chromosome Y has the fewest genes : 231.
• Average gene consists of 30,000 bases
• 2% of DNA sequence encodes for protein. (Wheat from the Chaff)
• Exact localization of 1.4 million of SNPs has been known

Benefits
Improved diagnosis of disease
Early detection of genetic predisposition to disease
Development of rational drug design
Gene therapy
Emergence of pharmacogenomics
