
INTRODUCTION TO DEOXYRIBONUCLEIC ACID STRUCTURE,
BIO-MOLECULAR OPERATORS, AND DNA COMPUTING

Chapter · January 2008


1 author:

Zuwairie Ibrahim
Universiti Malaysia Pahang

All content following this page was uploaded by Zuwairie Ibrahim on 30 March 2017.



Introduction to deoxyribonucleic acid structure, bio-molecular 1
operators,and DNA computing 
 

1
INTRODUCTION TO
DEOXYRIBONUCLEIC ACID
STRUCTURE, BIO-MOLECULAR
OPERATORS, AND DNA COMPUTING
Zuwairie Ibrahim

1.1 INTRODUCTION  
 
In this chapter, the basic structure of deoxyribonucleic acid
(DNA) is presented, starting from nucleotides, which are the
monomers of DNA, and ending with the famous helical structure of
DNA. To understand how DNA can be made to perform computation,
it is necessary to study several biochemical reactions, such as
DNA hybridization and denaturation, ligation, polymerization,
polymerase chain reaction (PCR), and magnetic bead separation.
These biomolecular operators, which are often employed for DNA
manipulation in DNA computing, are therefore also presented.
Next, the in vitro approach proposed by Adleman for solving the
Hamiltonian Path Problem (HPP) is presented, and some
implementation issues of Adleman’s DNA computer are discussed.
2   Progress in computation intelligence in vitro and in silico 

1.2 BASIC DNA STRUCTURE


1.2.1 NUCLEOTIDES
DNA is a polymer strung together from a series of monomers. The
monomers that form the building blocks of nucleic acids are called
nucleotides. Each nucleotide consists of a sugar ring, a phosphate,
and a nucleobase, as shown in Figure 1.1. The sugar is a
five-carbon sugar, and its carbons are labeled 1’, 2’, 3’, 4’, and
5’. A phosphate is attached to the 5’ carbon and a nucleobase is
attached to the 1’ carbon. In addition, a hydroxyl group (OH) is
attached to the 3’ carbon.
There are two major classes of nucleotides: those of RNA
(ribonucleic acid) and those of DNA. The nucleotides are classed
according to the group, X, that is attached to the 2’ carbon of the
sugar. In RNA, the nucleotides contain a ribose sugar (X = OH),
whereas in DNA the nucleotides contain a deoxyribose sugar
(X = H). Since this research focuses on DNA as a medium of
computation, DNA is described in more detail in the following
subsections.

Figure 1.1: A nucleotide. 


1.2.2 STANDARD NUCLEOBASE OF DNA

Nucleotides in DNA contain four types of nucleobases. These
nucleobases can be grouped into purines and pyrimidines. The
purine group consists of adenine (A) and guanine (G), whereas the
pyrimidine group consists of thymine (T) and cytosine (C).
Figures 1.2 (a), (b), (c), and (d) show the molecular structures of
the nucleobases A, G, T, and C respectively, where R indicates the
point of attachment to the 1’ carbon of the sugar [1].

Figure 1.2: (a) A nucleobase of Adenine (A) (b) A nucleobase of
Guanine (G) (c) A nucleobase of Thymine (T) (d) A nucleobase of
Cytosine (C).

1.2.3 SINGLE STRANDED STRUCTURE

Single stranded DNA (ssDNA) is a linear chain of nucleotides.
Adjacent nucleotides in the chain are linked through the
5’-phosphate of one and the 3’-hydroxyl of the next, forming a
phosphodiester bond, which is a strong covalent bond. The
resulting backbone is negatively charged. Each end of a single
strand is therefore uniquely identified as either the 3’ end or the
5’ end. Figure 1.3 shows how three different nucleotides are linked
to form a single stranded DNA [1].
The two chemically distinct ends, the 5’ end and the 3’ end,
establish the polarity or directionality of each chain. By
convention, the chain is oriented from the 5’ end to the 3’ end.
Thus, single stranded DNA is normally written as its sequence of
nucleobases from 5’ to 3’. For example, the single stranded DNA
5’-ATCG-3’ is normally written as ATCG.

1.2.4 DOUBLE STRANDED STRUCTURE

Two complementary single stranded DNAs can combine with each
other, under suitable conditions, to form a double stranded DNA
(dsDNA). This combination is shown conceptually in Figure 1.4 [1].
In this figure, the helical structure of double stranded DNA is
omitted. The intermolecular attachment is governed by
Watson-Crick base pairing, and the two strands are oriented
anti-parallel, which means that they run in opposite directions. In
other words, a single stranded DNA running 5’-3’ can combine only
with a single stranded DNA running 3’-5’. A unit of double
stranded DNA behaves as a single polymer, and its length is
described in terms of the number of base pairs (bp).
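The anti-parallel rule can be made concrete with a short Python sketch. It is only an illustration: the function names are hypothetical and not part of the original text, and strands are modeled as plain strings written 5’ to 3’.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner of `strand`, itself written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def can_form_duplex(strand_a: str, strand_b: str) -> bool:
    """True if two 5'->3' strands are fully complementary and anti-parallel."""
    return strand_b == reverse_complement(strand_a)


print(reverse_complement("ATCG"))       # CGAT
print(can_form_duplex("ATCG", "CGAT"))  # True
print(can_form_duplex("ATCG", "TAGC"))  # False: complementary only if read in parallel
```

Reversing the string before complementing is what encodes the anti-parallel orientation: the 3’ end of one strand faces the 5’ end of the other.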
[Structural drawing: three nucleotides joined by phosphodiester
bonds, running from the 5' end (phosphate) at the top to the 3' end
(hydroxyl, HO) at the bottom. Each sugar shows carbons 1' to 5',
with the base on the 1' carbon.]

Figure 1.3: A single stranded structure of DNA.



Figure 1.4: A simplified double stranded structure of DNA.

1.2.5 WATSON-CRICK BASE PAIRING

As discussed above, a single-stranded fragment has a
phospho-sugar backbone and four kinds of bases, denoted by the
symbols A, T, G, and C for adenine, thymine, guanine, and cytosine
respectively. These four bases, which can occur in any order in a
single stranded DNA, pair according to Watson-Crick
complementarity [2] to form a double-stranded helix of DNA. Under
Watson-Crick complementarity, A is paired with T by 2 hydrogen
bonds, whereas C is paired with G by 3 hydrogen bonds, as shown in
Figure 1.5 [3]. A hydrogen bond is a weak bond, but a duplex
contains many such pairs. The points of attachment of the base
pairs to the backbones are equally spaced, as shown in Figure 1.5,
which allows a regular helical structure. The helical structure is
shown in Figure 1.6 [4].
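As a small worked example of the pairing rule, the following Python sketch counts the hydrogen bonds in a fully paired duplex, using 2 bonds per A-T pair and 3 per G-C pair as stated above. The function name is hypothetical.

```python
def count_hydrogen_bonds(strand: str) -> int:
    """Hydrogen bonds between `strand` and its full Watson-Crick complement."""
    bonds = 0
    for base in strand:
        if base in "AT":
            bonds += 2      # A pairs with T by 2 hydrogen bonds
        elif base in "GC":
            bonds += 3      # G pairs with C by 3 hydrogen bonds
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return bonds


print(count_hydrogen_bonds("ATCG"))  # 10 = 2 + 2 + 3 + 3
```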

1.3 REVIEW OF BASIC BIOTECHNOLOGY

There are several essential biomolecular operations that are often
used for manipulating DNA during the computation. These
operations are explained in detail in this section.

Figure 1.5: (a) G-C base pair. (b) A-T base pair.

Figure 1.6: Helical structure of DNA.



1.3.1 SYNTHESIZING DNA


Short, chemically synthesized single stranded molecules are called
oligonucleotides, or simply oligos. They are useful in genetic
engineering as well as in DNA computing. With current technology,
sequences of 70-80 bases can be chemically synthesized without
much error. At present, it is possible to obtain a test tube
containing approximately 10^18 DNA molecules with a desired
sequence. Several commercial DNA synthesis companies offer this
service at a reasonable price. In Japan, at present, the price is
about ¥80/base.

1.3.2 HYBRIDIZATION AND DENATURATION

Hybridization is defined as the sequence-specific annealing of two
or more single stranded DNAs to form a dsDNA product. This
sequence-recognition property is very useful for DNA computing
because, in DNA computing, hybridization is what carries out the
computation. The operation is normally triggered by cooling down
the reaction solution in the test tube [5].
Hybridization can occur in three basic ways: bi-molecular
hybridization, multi-molecular hybridization, and uni-molecular
hybridization. In bi-molecular hybridization, two kinds of ssDNAs
form a native double helix structure of DNA, as shown in
Figure 1.7 [6]. In multi-molecular hybridization, three strands
are involved in the annealing. Multi-molecular hybridization is the
essence of Adleman’s DNA computing for solving an instance of the
Hamiltonian path problem. In uni-molecular hybridization, or
self-hybridization, an ssDNA forms a hairpin when two
complementary portions exist within the same strand, as depicted
in Figure 1.8.

Figure 1.7: Bi-molecular hybridization and denaturation of DNA.

Figure 1.8: An example of hairpin formation of DNA.


By heating the solution to about 85-95°C, dsDNAs come apart
because the hydrogen bonds between complementary nucleotides are
much weaker than the covalent bonds between adjacent nucleotides
within each strand. This separation is called melting or
denaturation. Thus, the two strands can be separated without
breaking the single strands [7], as depicted in Figure 1.7. The
same effect can be achieved by washing the double stranded DNAs in
doubly distilled water.

1.3.3 LIGATION

Ligation is often invoked after single DNA strands have annealed
according to Watson-Crick complementarity. Many single-strand
fragments can be connected in series in this way, with ligase used
as ‘glue’ to seal the covalent bonds between adjacent fragments
[8]. Figure 1.9 shows the principle of ligation. In this figure,
three kinds of ssDNAs, namely strand A, strand B, and strand C,
take part in the ligation. Strand A and strand B must lie adjacent
to each other without a gap, each partially hybridized to strand C.
The product of ligation is a ‘new’ strand AB. In the laboratory,
ligation is generally carried out with a DNA ligase, such as T4
DNA ligase. Moreover, strand A must have a 5’ PO4, and the energy
required for ligation is supplied by ATP or NAD+.
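The adjacency condition can be sketched in Python as follows. This is a simplified model under stated assumptions: strands are strings written 5’ to 3’, strand C must pair exactly with the end of A followed by the start of B, and the chemistry (phosphate groups, ligase, energy source) is ignored. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the Watson-Crick partner of `strand`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def ligate(strand_a: str, strand_b: str, strand_c: str):
    """Return the joined strand AB if C holds A and B side by side, else None."""
    # The stretch that C binds, read 5'->3' along A followed by B.
    footprint = reverse_complement(strand_c)
    # C must cover at least one base of A and at least one base of B.
    for k in range(1, len(footprint)):
        if strand_a.endswith(footprint[:k]) and strand_b.startswith(footprint[k:]):
            return strand_a + strand_b
    return None


# C = TCCGTA pairs with ...TAC (end of A) and GGA... (start of B).
print(ligate("ACGTAC", "GGATCC", "TCCGTA"))  # ACGTACGGATCC
print(ligate("ACGTAC", "GGATCC", "AAAAAA"))  # None: C does not bridge A and B
```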

1.3.4 CUTTING DNA BY RESTRICTION ENZYME

A double-stranded DNA can be cut by a restriction endonuclease
enzyme. This operation depends on the sequence of the dsDNA. Four
common cut sites, or restriction sites, are shown in Figure 1.10
[9]. The cut often leaves ‘sticky ends’, which may be useful for
directing later annealing and ligation in DNA computing. There are
two types of sequence-specific endonucleases: Type I and Type II.
Type II enzymes cut at the restriction site, as shown in
Figure 1.10, whereas Type I enzymes cut away from the restriction
site. The restriction site is symmetric, usually 6 bp in length,
and the enzyme cuts both backbones symmetrically.
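A minimal Python sketch of cutting at a restriction site is given below. The sites of Figure 1.10 are not reproduced in the text, so the example assumes the well-known EcoRI site GAATTC, which is cut after the first G. Only one strand is modeled, and the function name is hypothetical.

```python
def cut_strand(strand: str, site: str = "GAATTC", offset: int = 1):
    """Split a 5'->3' strand at each occurrence of `site`.

    The cut falls `offset` bases into the site (assumed EcoRI: G^AATTC).
    """
    fragments = []
    start = 0
    position = strand.find(site)
    while position != -1:
        cut = position + offset
        fragments.append(strand[start:cut])
        start = cut
        position = strand.find(site, position + 1)
    fragments.append(strand[start:])
    return fragments


print(cut_strand("TTGAATTCAA"))  # ['TTG', 'AATTCAA']
```

Because the site is symmetric, the partner strand is cut at the same offset when read 5’ to 3’, which leaves the single stranded overhangs referred to above as sticky ends.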

Figure 1.9: Ligation.

1.3.5 POLYMERIZATION

The substrates required for polymerization are a template strand
to be copied, a primer strand to be 3’-extended, and incoming dNTP
monomers, which act as both base and energy sources. The reaction
is carried out by DNA polymerase, which implements a 5’ to 3’
copying operation, as depicted in Figure 1.11 [9]. During the
copying operation, the 3’ end of the primer strand is extended.
Note that no 3’ to 5’ copying operation has ever been observed.
This operation also depends on Watson-Crick complementarity: an A
in the template is copied as T, a G is copied as C, and so on.
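Primer extension can be sketched in Python as follows. The model is deliberately simple and rests on assumptions: both strands are strings written 5’ to 3’, the primer must pair perfectly with the template, and extension runs to the end of the template. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the Watson-Crick partner of `strand`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def extend_primer(template: str, primer: str):
    """Extend the 3' end of `primer` along `template`; None if it cannot anneal."""
    full_copy = reverse_complement(template)
    position = full_copy.find(primer)
    if position == -1:
        return None
    product = primer
    # Template bases not yet copied lie towards the template's 5' end.
    uncopied = template[: len(template) - position - len(primer)]
    # The polymerase reads the template 3'->5' and grows the product 5'->3'.
    for base in reversed(uncopied):
        product += COMPLEMENT[base]
    return product


print(extend_primer("GGATCTAAC", "GTTA"))  # GTTAGATCC
print(extend_primer("GGATCTAAC", "TAGA"))  # TAGATCC
```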

Figure 1.10: Four types of common restriction sites of endonucleases.

1.3.6 POLYMERASE CHAIN REACTION (PCR)

PCR is an incredibly sensitive copying machine for DNA. It can
also be used for DNA detection. Starting from a single DNA
molecule containing the target site, a million or even a billion
copies can be created by the PCR process: in n cycles, it can
produce 2^n copies of the same molecule. PCR needs short
sub-sequence strands called ‘primers’, usually about 20 bases
long, which mark the start and end sites on the template for
replication. PCR normally runs for 20-30 cycles of 3 phases:
separating the two strands of the dsDNA at about 95°C, annealing
the primers at 55°C, and extension at 74°C [10]. Completing the
cycles normally takes about two to three hours. Figure 1.12 shows
the operation of the polymerase chain reaction up to the third
cycle.
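The doubling per cycle can be checked with a few lines of Python. This sketch assumes ideal amplification, in which every molecule is copied in every cycle, and the function names are hypothetical.

```python
def copies_after(cycles: int, start: int = 1) -> int:
    """Number of copies after `cycles` ideal PCR cycles."""
    return start * 2 ** cycles


def cycles_needed(target_copies: int, start: int = 1) -> int:
    """Smallest number of ideal cycles that reaches `target_copies`."""
    cycles = 0
    copies = start
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


print(copies_after(3))        # 8, as in the third cycle of Figure 1.12
print(cycles_needed(10**6))   # 20 cycles for a million copies
print(cycles_needed(10**9))   # 30 cycles for a billion copies
```

Under this idealization, a million to a billion copies correspond to the 20-30 cycles quoted above.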

Figure 1.11: Polymerization in action.



[Schematic: starting from the dsDNA to be amplified, each cycle
separates the DNA strands, anneals the DNA primers, and performs
DNA synthesis (primer extension). The first cycle produces 2
dsDNAs, the second cycle 4 dsDNAs, and the third cycle 8 dsDNAs.]

Figure 1.12: Polymerase chain reaction.

1.3.7 GEL ELECTROPHORESIS

DNA strands in a solution can be separated by length by means of
gel electrophoresis. Strictly speaking, the molecules are
separated according to their weight, which is almost proportional
to their length [7]. This technique relies on the fact that DNA
molecules are negatively charged [11]. Hence, when placed in an
electric field, they move towards the positive electrode at
different speeds. When an electric field is applied across the
gel, longer molecules lag behind shorter ones, as shown in
Figure 1.13 [12]. The speed at which DNA migrates through a gel
depends heavily on the gel porosity and the magnitude of the
electric field. Polyacrylamide gel is used for separating shorter
dsDNAs, from 10 bp to 500 bp, whereas agarose gel is frequently
used for longer dsDNAs of more than 500 bp. An example of the
output of gel electrophoresis is depicted in Figure 1.14. In DNA
computing, this technique is used to visualize the results of
computation. Normally, at the end of this process, the gel is
photographed for convenience.
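As a rough illustration, the selection of gel type and the ordering of bands can be written in Python. The 500 bp threshold is taken from the text; the function names are hypothetical, and real migration distances depend on gel porosity and field strength, which are not modeled here.

```python
def choose_gel(length_bp: int) -> str:
    """Pick a gel type from the dsDNA length, using the 500 bp rule of thumb."""
    return "polyacrylamide" if length_bp <= 500 else "agarose"


def migration_order(lengths_bp):
    """Order bands from the fastest (shortest) to the slowest (longest)."""
    return sorted(lengths_bp)


print(choose_gel(140))                   # polyacrylamide
print(choose_gel(2000))                  # agarose
print(migration_order([140, 80, 100]))   # [80, 100, 140]
```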

Figure 1.13: Polyacrylamide gel electrophoresis.



Figure 1.14: Example of a gel image.

1.3.8 DNA EXTRACTION

By exploiting the specificity of hybridization, ssDNAs can also be
segregated by sequence. Given a DNA mixture T, the objective of
this operation is to remove the subset Ts of strands in T that
contain the subsequence S. Figure 1.15 shows an example of DNA
extraction, in which S = AGCATA. Before the extraction,
biotinylated strands F carrying S*, where * denotes Watson-Crick
complementation, are conjugated to streptavidin-coated magnetic
beads. Next, the strands F are mixed with the mixture T so that
they hybridize to the strands in T containing S. After
hybridization, the strands F are removed magnetically from the
mixture T, and the strands of T hybridized to S* are removed along
with them. Lastly, the strands Ts can be recovered from the
strands F by melting or washing.
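The extraction operation amounts to splitting the mixture by subsequence, as the following Python sketch shows. The mixture contents are made up for illustration, S = AGCATA is taken from the example above, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def probe_for(s: str) -> str:
    """Sequence S* carried by the bead-bound strand F, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(s))


def extract(mixture, s):
    """Split `mixture` into strands containing `s` (captured) and the rest."""
    captured = [strand for strand in mixture if s in strand]
    remaining = [strand for strand in mixture if s not in strand]
    return captured, remaining


mixture_t = ["TTAGCATACC", "GGGTTTAAAC", "AGCATA", "CCCAGCATGG"]
captured, remaining = extract(mixture_t, "AGCATA")
print(probe_for("AGCATA"))  # TATGCT
print(captured)             # ['TTAGCATACC', 'AGCATA']
print(remaining)            # ['GGGTTTAAAC', 'CCCAGCATGG']
```

The sketch treats capture as exact substring matching; in the laboratory, partial matches can also hybridize, so the separation is less clean.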

Figure 1.15: An example of DNA extraction by using streptavidin-
coated magnetic bead.

1.4 THE BEGINNING OF DNA COMPUTING


1.4.1 DNA COMPUTING FOR HAMILTONIAN PATH
PROBLEM

A non-deterministic algorithm for solving the directed HPP on a
graph with n vertices, a start vertex v_in, and an end vertex
v_out is as follows:
Step 1: Generate random paths through the graph in large quantity.
Step 2: Reject all paths that do not begin with v_in and end in
v_out.
Step 3: Reject all paths that do not involve exactly n vertices.
Step 4: For each of the n vertices v, reject all paths that do not
involve v.
Step 5: The answer is ‘YES’ if any path remains, otherwise ‘NO’.
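The five steps can be sketched in silico as a generate-and-filter procedure. The Python code below is an illustration under stated assumptions, not Adleman’s procedure itself: random walks stand in for Step 1, the function name and parameters are hypothetical, and the edge set is read off the edge labels of Figure 1.19 rather than the full graph of Figure 1.16, which is not reproduced in the text.

```python
import random


def solve_hpp(edges, vertices, v_in, v_out, trials=5000, seed=0):
    """Randomized generate-and-filter test for a Hamiltonian path."""
    rng = random.Random(seed)
    n = len(vertices)
    successors = {v: [] for v in vertices}
    for a, b in edges:
        successors[a].append(b)

    # Step 1: generate many random paths (random walks of random length).
    paths = []
    for _ in range(trials):
        target_len = rng.randint(1, 2 * n)
        path = [rng.choice(vertices)]
        while len(path) < target_len and successors[path[-1]]:
            path.append(rng.choice(successors[path[-1]]))
        paths.append(path)

    # Step 2: keep paths that begin with v_in and end in v_out.
    paths = [p for p in paths if p[0] == v_in and p[-1] == v_out]
    # Step 3: keep paths that involve exactly n vertices.
    paths = [p for p in paths if len(p) == n]
    # Step 4: for each vertex, keep only the paths that involve it.
    for v in vertices:
        paths = [p for p in paths if v in p]
    # Step 5: 'YES' if any path remains, otherwise 'NO'.
    return len(paths) > 0


# Assumed example graph: edges taken from the labels in Figure 1.19.
EDGES = [(0, 1), (1, 5), (5, 6), (3, 4), (4, 1), (1, 2), (2, 3), (0, 3), (4, 5)]
VERTICES = list(range(7))

print("YES" if solve_hpp(EDGES, VERTICES, 0, 6) else "NO")
```

Because Step 1 is random, a ‘NO’ from this sketch only means that no Hamiltonian path appeared among the sampled paths. The large number of trials plays the role of the large quantity of DNA molecules in the test tube.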

Adleman showed that this algorithm can be implemented at the
molecular level. First, each vertex is represented by a single
stranded sequence of 20 nucleotides. A length of 20 is enough to
ensure that the codes are “sufficiently different”. The set of
20-mer oligonucleotides, or oligos for short, that Adleman used to
encode the vertices was randomly designed in advance. The oligo
encoding vertex vi is denoted Oi, while the Watson-Crick
complement of Oi is denoted Ōi. Each edge eij, connecting vertex vi
to vertex vj, is encoded by a DNA sequence Oij, normally a 20-mer,
built from two parts: the first part is the 3’ 10-mer of Oi (or
the whole 20-mer when vi is v0), and the second part is the 5’
10-mer of Oj (or the whole 20-mer when vj is v6). Based on the
example problem in Figure 1.16, this method of encoding is
depicted in Figure 1.17. Note that all the DNA sequences are
written from 5’ to 3’.
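The encoding rule can be written down in a few lines of Python. The sketch below uses freshly generated random 20-mers, not Adleman’s actual sequences, and the function names are hypothetical. Sequences are strings written 5’ to 3’, so the 3’ 10-mer of a vertex oligo is its last ten characters and the 5’ 10-mer is its first ten.

```python
import random

BASES = "ACGT"


def random_oligo(length: int, rng: random.Random) -> str:
    """Random DNA sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def edge_oligo(vertex_oligos, i, j, first=0, last=6):
    """Edge oligo for the edge from vertex i to vertex j."""
    # 3' half of the source vertex, or all of it for the start vertex.
    left = vertex_oligos[i] if i == first else vertex_oligos[i][10:]
    # 5' half of the target vertex, or all of it for the end vertex.
    right = vertex_oligos[j] if j == last else vertex_oligos[j][:10]
    return left + right


rng = random.Random(1)
vertex_oligos = {v: random_oligo(20, rng) for v in range(7)}

print(len(edge_oligo(vertex_oligos, 0, 1)))  # 30
print(len(edge_oligo(vertex_oligos, 1, 2)))  # 20
print(len(edge_oligo(vertex_oligos, 5, 6)))  # 30
```

With this construction, the complement of a vertex oligo can pair with the tail of one edge oligo and the head of the next, holding consecutive edges side by side.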
To implement Step 1 of the algorithm, all the oligos
representing the edges, Oij, and the complementary oligos of the
vertices, Ōi, are poured into a single test tube. Hybridization and
ligation reactions are then applied to the mixture, resulting in
the formation of DNA molecules that encode a large number of
random paths through the graph, as dictated by Watson-Crick
complementarity. Figure 1.18 shows an example of how these
formations arise after hybridization. Figure 1.19, on the other
hand, shows several possible formations after the ligation
reaction is completed.

Figure 1.16: (a) A directed graph for Hamiltonian path problem (b) The
answer of Hamiltonian path problem.

[Diagram labels: vertex oligos O2, O3, and O4, and edge oligos O23
and O34.]

Figure 1.17: Encoding method based on Adleman DNA computing.



Figure 1.18: Hybridization and ligation in Adleman DNA computing.


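[Diagram: four example formations, each drawn as a row of edge
oligos above a row of vertex oligos:
O01 O15 O56 over O0 O1 O5 O6
O34 O41 O12 O23 over O4 O1 O2
O03 O34 O45 O56 over O0 O3 O4 O5 O6
O01 O12 O23 O34 O45 O56 over O0 O1 O2 O3 O4 O5 O6]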

Figure 1.19: Examples showing several formations of candidate paths.

Step 2 is implemented by subjecting the product of Step 1 to
polymerase chain reaction (PCR) with two primers, O0 and Ō6. As a
result, all formations that begin with v0 and end with v6 are
exponentially amplified. To implement Step 3, the product of
Step 2 is separated by length using gel electrophoresis. According
to the gel image, the double stranded DNAs (dsDNAs) of 140 base
pairs (bp), which represent paths that begin at v0, end at v6, and
pass through five intermediate vertices, are excised and extracted
from the gel, as shown in Figure 1.20. The extracted DNA molecules
are again amplified by PCR.
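The 140 bp figure follows from the encoding: the first and last edge oligos of a path from v0 to v6 are 30-mers and every other edge oligo is a 20-mer. A short Python check, with a hypothetical function name, makes the arithmetic explicit.

```python
def path_length_bp(path, first=0, last=6, half=10):
    """Length in bp of the ligated molecule encoding `path`."""
    total = 0
    for i, j in zip(path, path[1:]):
        # The start vertex contributes its whole 20-mer, others their 3' half.
        left = 2 * half if i == first else half
        # The end vertex contributes its whole 20-mer, others their 5' half.
        right = 2 * half if j == last else half
        total += left + right
    return total


print(path_length_bp([0, 1, 2, 3, 4, 5, 6]))  # 140 = 30 + 4 * 20 + 30
print(path_length_bp([0, 1, 5, 6]))           # 80
print(path_length_bp([0, 3, 4, 5, 6]))        # 100
```

A path from v0 to v6 that visits k vertices therefore gives a molecule of 20k bp, so the 140 bp band contains exactly the paths with seven vertices.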
Step 4 can be implemented by affinity purification of the
product of Step 3 with a biotin-avidin magnetic bead system,
repeated n times. Each time, the DNA molecules that contain the
subsequence Oi are selected and separated from the solution.
Figure 1.21 shows a magnetic bead separation for selecting and
separating the DNA molecules containing the subsequence O2. The
product of the repeated magnetic bead separation is the set of DNA
molecules representing paths that enter every vertex at least
once. Lastly, Step 5 can be carried out with a 260 nm ultraviolet
(UV) source to check whether any DNA molecules have survived in
the test tube after Step 1 to Step 4. The answer to the HPP is
‘YES’ if any DNA molecules remain; otherwise, it is ‘NO’.

Figure 1.20: Selection by gel electrophoresis.



Figure 1.21: Extraction by magnetic bead separation.

1.5 SUMMARY

This chapter has covered the basic structure of DNA: nucleotides,
nucleobases, the single stranded structure, the double stranded
structure, and Watson-Crick base pairing. Basic biotechnology has
been outlined as well. The chapter has also presented the
beginning of DNA computing, taking a 7-vertex instance of the HPP
as an example problem.

1.6 REFERENCES

[1] L. Stryer, “Biochemistry”, W.H. Freeman and Company, 4th
Edition, 1995.
[2] J.D. Watson and F.H.C. Crick, “A structure for
deoxyribose nucleic acid”, Nature, 1953.
[3] K. van Holde, “Principles of Biophysical Chemistry”,
Prentice-Hall, 1998.
[4] http://members.aol.com/Cappuccinno21/Biochem/DNAPics/helix.g.
[5] F. Ausubel and K. Struhl, “Short protocols in molecular
biology: A compendium of methods from current protocols
in molecular biology”, Wiley & Sons, 3rd Edition, 1995.
[6] http://web.siumed.edu/~bbartholomew/images/chapter5/F05-14.jpg.
[7] C.S. Calude and G. Paun, “Computing with cells and atoms
- An introduction to quantum, DNA, and membrane
computing”, Taylor & Francis Inc., New York, 2001.
[8] M. Zucca, “DNA based computational models”, PhD
Thesis, Politecnico Di Torino, Italy, 2000.
[9] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and
P. Walter, “Molecular biology of the cell”, Garland
Science, 4th Edition, 2002.
[10] J.P. Fitch, “An engineering introduction to biotechnology”,
SPIE, 2002.
[11] G. Paun, G. Rozenberg, and A. Salomaa, “DNA
computing: new computing paradigms”, Springer, New
York, 1998.
[12] M. Amos, “DNA computation”, PhD Thesis, The
University of Warwick, UK, 1997.
