Introduction to Deoxyribonucleic Acid Structure, Bio-Molecular Operators, and DNA Computing

See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/315698215
Chapter · January 2008
Zuwairie Ibrahim
Universiti Malaysia Pahang
All content following this page was uploaded by Zuwairie Ibrahim on 30 March 2017.

1
INTRODUCTION TO DEOXYRIBONUCLEIC ACID STRUCTURE,
BIO-MOLECULAR OPERATORS, AND DNA COMPUTING
Zuwairie Ibrahim
1.1 INTRODUCTION

In this chapter, the basic structure of deoxyribonucleic acid
(DNA) is presented, starting from nucleotides, which are the
monomers of DNA, and ending with the well-known helical
structure. To understand how DNA can be made to perform
computation, it is necessary to study several biochemical
reactions, such as DNA hybridization and denaturation, ligation,
polymerization, polymerase chain reaction (PCR), and magnetic
bead separation. These biomolecular operators, which are often
employed for DNA manipulation in DNA computing, are therefore
also presented. Next, the in vitro approach proposed by Adleman
for solving the Hamiltonian Path Problem (HPP) is presented, and
some implementation issues of the Adleman DNA computer are
discussed.
2 Progress in computation intelligence in vitro and in silico
1.2 BASIC DNA STRUCTURE

1.2.1 NUCLEOTIDES
DNA is a polymer, which is strung together from a series of
monomers. The monomers that form the building blocks of nucleic
acids are called nucleotides. Each nucleotide consists of a sugar
ring, a phosphate, and a nucleobase, as shown in Figure 1.1. The
sugar contains five carbon atoms, which are labelled 1’, 2’, 3’,
4’, and 5’. The phosphate is attached to the 5’ carbon and the
nucleobase is attached to the 1’ carbon. In addition, a hydroxyl
group (OH) is attached to the 3’ carbon.
There are two major classes of nucleic acids: RNA
(ribonucleic acid) and DNA. Their nucleotides are classified by
the group, X, that is attached to the 2’ carbon of the sugar. In
RNA, the nucleotides contain a ribose sugar (X = OH), whereas in
DNA, the nucleotides contain a deoxyribose sugar (X = H). Since
this research focuses on DNA as a medium of computation, DNA is
described in more detail in the following subsections.
Figure 1.1: A nucleotide.


1.2.2 STANDARD NUCLEOBASE OF DNA
Nucleotides in DNA contain four types of nucleobases. These
nucleobases can be grouped into purines and pyrimidines. The
purine group consists of Adenine (A) and Guanine (G), whereas the
pyrimidine group consists of Thymine (T) and Cytosine (C). Figure
1.2 (a) to Figure 1.2 (d) show the molecular structures of the
nucleobases A, G, T, and C respectively, where R indicates the
point of attachment to the 1’ carbon of the nucleotide [1].
Figure 1.2: (a) A nucleobase of Adenine (A) (b) A nucleobase of
Guanine (G) (c) A nucleobase of Thymine (T) (d) A nucleobase of
Cytosine (C).
1.2.3 SINGLE STRANDED STRUCTURE
Single stranded DNA (ssDNA) is a linear chain of nucleotides.
In this chain, the 5’-phosphate of one nucleotide is linked to the
3’-hydroxyl of the next by a phosphodiester bond, which is a
strong covalent bond, and the result is a negatively charged
backbone. Hence, the two ends of a single strand are uniquely
identified as the 3’ end and the 5’ end. Figure 1.3 shows how three
different nucleotides are linked to form a single stranded DNA [1].
These two chemically distinct ends, the 5’ end and the 3’
end, establish the polarity or directionality of each chain. By
convention, the chain is oriented from the 5’ end to the 3’ end.
Thus, a single stranded DNA is normally written as its sequence of
nucleobases from 5’ to 3’. For example, the single stranded DNA
5’-ATCG-3’ is normally written as ATCG.
1.2.4 DOUBLE STRANDED STRUCTURE
Two complementary single stranded DNAs can combine with each
other, under specific conditions, to form a double stranded DNA
(dsDNA). This combination is shown conceptually in Figure 1.4
[1], in which the helical structure of double stranded DNA is
omitted. This intermolecular attachment obeys Watson-Crick base
pairing and is anti-parallel, which means that the two strands run
in opposite directions. In other words, a 5’-3’ single stranded DNA
can combine only with a 3’-5’ single stranded DNA. A unit of
double stranded DNA behaves as a single polymer, and its length
can be described in terms of the number of base pairs (bp).

Figure 1.3: A single stranded structure of DNA (three nucleotides
linked by phosphodiester bonds, drawn from the 5’ end to the 3’ end).

Figure 1.4: A simplified double stranded structure of DNA.
1.2.5 WATSON-CRICK BASE PAIRING
As discussed above, a single-stranded fragment has a
sugar-phosphate backbone and four kinds of bases, denoted by the
symbols A, T, G, and C for adenine, thymine, guanine, and
cytosine respectively. These four nucleobases, which can occur in
any order in a single stranded DNA, pair according to
Watson-Crick complementarity [2] to form a double stranded helix
of DNA. Under Watson-Crick complementarity, A is paired with T
by 2 hydrogen bonds, whereas C is paired with G by 3 hydrogen
bonds, as shown in Figure 1.5 [3]. Each hydrogen bond is weak, but
many base pairs together hold the two strands. The points of
attachment to the backbones are equally spaced, as shown in
Figure 1.5, which allows a regular helical structure. The helical
structure is shown in Figure 1.6 [4].
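The pairing rule can be sketched in a few lines of Python. This is
only an illustration under simple assumptions: strands are plain
strings written 5’ to 3’, and the function names are chosen here
and do not come from the cited references.

```python
# Watson-Crick pairing rule from the text: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    The two strands of a duplex are anti-parallel, so the complement
    is read in the reverse direction.
    """
    return "".join(PAIR[base] for base in reversed(strand))

def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds in the duplex formed with the full
    complement: 2 per A-T pair and 3 per G-C pair."""
    return sum(2 if base in "AT" else 3 for base in strand)

print(reverse_complement("ATCG"))  # CGAT
print(hydrogen_bonds("ATCG"))      # 2 + 2 + 3 + 3 = 10
```

Applying reverse_complement twice returns the original strand,
which mirrors the symmetry of the pairing rule.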
1.3 REVIEW OF BASIC BIOTECHNOLOGY
There are several essential biomolecular operations, which are
often used for manipulating DNA during the computation. Those
operations are explained in detail in this section.
Figure 1.5: (a) G-C base pair. (b) A-T base pair.
Figure 1.6: Helical structure of DNA.

1.3.1 SYNTHESIZING DNA

Short chemically synthesized single stranded molecules are called
oligonucleotides or simply oligos. They are useful in genetic
engineering as well as in DNA computing. With current technology,
sequences of 70-80 nucleotides can be chemically synthesized
without much error. At present, it is possible to get a test tube
containing approximately 10^18 DNA molecules with a desired
sequence. Several commercial DNA synthesis companies offer this
service at a reasonable price. In Japan, at present, the price is
about ¥80/base.
1.3.2 HYBRIDIZATION AND DENATURATION
Hybridization is defined as the sequence-specific annealing of two
or more single stranded DNAs to form a dsDNA product. This
sequence-recognition property is very useful because, in DNA
computing, hybridization is what carries out the computation.
Hybridization is normally induced by cooling down the reaction
solution in the test tube [5].
There are basically three ways in which hybridization can
occur: bi-molecular hybridization, multi-molecular
hybridization, and uni-molecular hybridization. Bi-molecular
hybridization involves two ssDNAs, which form a native double
helix structure of DNA, as shown in Figure 1.7 [6]. In
multi-molecular hybridization, three or more strands are involved
in the annealing. Multi-molecular hybridization is the essence
of Adleman DNA computing for solving an instance of the
Hamiltonian path problem. In uni-molecular hybridization, or
self-hybridization, an ssDNA forms a hairpin if a complementary
portion exists within the same strand, as depicted in
Figure 1.8.
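As a rough illustration of the uni-molecular case, the following
Python sketch checks whether a strand contains a segment whose
Watson-Crick complement appears further along the same strand.
The stem length and minimum loop length are assumed values chosen
for the example, and real hairpin formation also depends on
thermodynamics, which this sketch ignores.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement of a 5'->3' strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(strand))

def can_form_hairpin(strand, stem=4, min_loop=3):
    """Return True if some stem-length segment has its reverse
    complement downstream in the same strand, separated from it by
    at least min_loop unpaired bases (hypothetical thresholds)."""
    n = len(strand)
    for i in range(n - 2 * stem - min_loop + 1):
        target = reverse_complement(strand[i:i + stem])
        # Search only after the segment plus the minimum loop.
        if strand.find(target, i + stem + min_loop) != -1:
            return True
    return False

print(can_form_hairpin("AAGCTTTTGCTT"))  # True: AAGC pairs with GCTT
print(can_form_hairpin("AAAAAAAAAAAA"))  # False: no complementary part
```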

Figure 1.7: Bi-molecular hybridization and denaturation of DNA.
Figure 1.8: An example of hairpin formation of DNA.

By heating up the solution to about 85-95°C, dsDNAs will
come apart because the hydrogen bonds between complementary
nucleotides are much weaker than the covalent bonds between
adjacent nucleotides within each strand. This separation is
called melting or denaturation. Thus, the two strands can be
separated without breaking the single strands [7], as depicted in
Figure 1.7. The same effect can be achieved by washing the double
stranded DNAs in doubly distilled water.
1.3.3 LIGATION
Ligation is often invoked after the single DNA strands are
annealed according to Watson-Crick complementarity. Many
single-stranded fragments are connected in series, and ligase is
used as ‘glue’ to form the covalent bonds between adjacent
fragments [8]. Figure 1.9 shows the principle of ligation. In this
figure, three ssDNAs, namely strand A, strand B, and strand C,
take part in the ligation. Strand A and strand B must lie next to
each other without a gap, and each hybridizes partially with
strand C. The product of ligation is a ‘new’ strand AB. Ligation
is generally implemented in the laboratory with a DNA ligase,
such as T4 DNA ligase. Moreover, strand A must have a 5’ PO4, and
the energy required for ligation is supplied by ATP or NAD+.
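The condition for ligation can be sketched as follows. This is a
simplified string model, not a description of the enzyme
chemistry: strands are written 5’ to 3’, strand C is treated as a
splint that must pair with the 3’ end of A and the 5’ end of B
with no gap between them, and the function name ligate is chosen
here for illustration.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement of a 5'->3' strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(strand))

def ligate(a, b, c):
    """Return the joined strand AB if strand c holds the 3' end of a
    and the 5' end of b side by side, otherwise None."""
    site = reverse_complement(c)  # the stretch of AB that c pairs with
    for k in range(1, len(site)):
        # k bases of the site pair with the end of a, the rest with
        # the start of b, so the junction has no gap.
        if a.endswith(site[:k]) and b.startswith(site[k:]):
            return a + b
    return None

a, b = "ATGCCG", "TTAGCA"
c = reverse_complement("CCG" + "TTA")  # splint spanning the junction
print(c)                       # TAACGG
print(ligate(a, b, c))         # ATGCCGTTAGCA
print(ligate(a, b, "AAAAAA"))  # None: nothing holds A next to B
```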
1.3.4 CUTTING DNA BY RESTRICTION ENZYME
It is possible to cut a double-stranded DNA with a restriction
endonuclease enzyme. This operation depends on the sequence of
the dsDNA. Four common cut sites, or restriction sites, are shown
in Figure 1.10 [9]. This operation often produces ‘sticky ends’,
which may be useful for directing later annealing and ligation in
DNA computing. There are two types of sequence-specific
endonucleases: Type I and Type II. Type II enzymes cut at the
restriction site, as shown in Figure 1.10, whereas Type I enzymes
cut away from the restriction site. The restriction site is
symmetric and usually 6 bp in length, and the enzyme cuts both
backbones symmetrically.
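A cut at a restriction site can be sketched on the top strand of a
dsDNA as below. The enzyme used in the example is an assumption
made for illustration: EcoRI, which recognizes GAATTC and cuts
after the G. The chapter does not state which enzymes appear in
Figure 1.10, and the function names are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement of a 5'->3' strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(strand))

def is_symmetric(site):
    """A symmetric site reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)

def digest(top_strand, site="GAATTC", cut_offset=1):
    """Cut the top strand at every occurrence of the site.

    cut_offset is the number of site bases left on the upstream
    fragment (1 for the assumed EcoRI example, G^AATTC).
    """
    fragments, start = [], 0
    pos = top_strand.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(top_strand[start:cut])
        start = cut
        pos = top_strand.find(site, pos + 1)
    fragments.append(top_strand[start:])
    return fragments

print(is_symmetric("GAATTC"))  # True
print(digest("AAGAATTCGG"))    # ['AAG', 'AATTCGG']
```

Because the site is symmetric, the bottom strand is cut at the
mirrored position, which leaves the single stranded overhangs
that the text calls sticky ends.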

Figure 1.9: Ligation.
1.3.5 POLYMERIZATION
The substrates required for polymerization are a template strand to
be copied, a primer strand to be extended at its 3’ end, incoming
dNTP monomers, which act as both base and energy sources, and DNA
polymerase. DNA polymerase implements a 5’ to 3’ copying
operation, as depicted in Figure 1.11 [9]. During the copying
operation, the 3’ end of the primer strand is extended. Note that
no 3’ to 5’ copying operation has ever been observed. This
operation also depends on Watson-Crick complementarity. In other
words, A is copied to T, G is copied to C, and vice versa.
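Primer extension can be sketched with the same string model. The
sketch assumes that both strands are written 5’ to 3’, that the
primer anneals at exactly one place on the template, and that the
polymerase runs to the end of the template. The function name
extend_primer is chosen here for illustration.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement of a 5'->3' strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(strand))

def extend_primer(template, primer):
    """Extend the 3' end of the primer along the template.

    Returns the new strand (5'->3'), or None if the primer does not
    anneal anywhere on the template.
    """
    site = reverse_complement(primer)  # where the primer binds
    pos = template.find(site)
    if pos == -1:
        return None
    # The template bases on the 5' side of the binding site are the
    # ones still to be copied; the new strand grows 5'->3'.
    uncopied = template[:pos]
    return primer + reverse_complement(uncopied)

template = "AAGGCCTTACGT"
primer = "ACGTA"  # complementary to the 3' end of the template
print(extend_primer(template, primer))  # ACGTAAGGCCTT
```

When the primer binds at the 3’ end of the template, the result is
the full complementary strand.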
Figure 1.10: Four types of common restriction sites of endonucleases.
1.3.6 POLYMERASE CHAIN REACTION (PCR)
PCR is an incredibly sensitive copying machine for DNA. It can
also be used for DNA detection. Given a single DNA molecule
containing a specific site, a million or even a billion similar
molecules can be created by the PCR process. In n cycles, it can
produce 2^n copies of the same molecule. PCR needs short
sub-sequence strands called ‘primers’, which are usually about 20
bases long, to signal the specific start and end sites on a
template for replication. PCR normally runs for 20-30 cycles of 3
phases: separating the base-paired strands of DNA at about 95°C,
annealing at 55°C, and extension at 74°C [10]. It normally takes
about two to three hours to complete the cycles. Figure 1.12 shows
the operations of the polymerase chain reaction up to the third
cycle.
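The doubling per cycle can be checked with a short calculation.
The sketch assumes ideal amplification, in which every molecule
is copied in every cycle; real reactions fall short of this.

```python
def pcr_copies(initial_molecules, cycles):
    """Number of dsDNA molecules after the given number of cycles,
    assuming every molecule is doubled in every cycle."""
    return initial_molecules * 2 ** cycles

print(pcr_copies(1, 3))   # 8, as in the third cycle of Figure 1.12
print(pcr_copies(1, 20))  # 1048576, about a million
print(pcr_copies(1, 30))  # 1073741824, about a billion
```

The typical range of 20-30 cycles therefore corresponds to the
"million or even a billion" copies mentioned above.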
Figure 1.11: Polymerization in action.

Figure 1.12: Polymerase chain reaction. In each cycle, the DNA
strands are separated, the primers are annealed, and the primers
are extended by DNA synthesis. The first, second, and third cycles
produce 2, 4, and 8 dsDNAs respectively.
1.3.7 GEL ELECTROPHORESIS
DNA strands in a solution can be separated according to their
length by means of gel electrophoresis. In fact, the molecules are
separated according to their weight, which is almost proportional
to their length [7]. This technique relies on the fact that DNA
molecules are negatively charged [11]. Hence, when they are placed
in an electric field, they move towards the positive electrode at
different speeds. If an electric field is applied across the gel,
longer molecules lag behind shorter ones, as shown in Figure
1.13 [12]. The speed of the DNA molecules in a gel depends heavily
on the gel porosity and the magnitude of the electric field.
Polyacrylamide gel is used for separating shorter dsDNAs, from
10 bp to 500 bp. On the other hand, agarose gel is frequently
used for longer dsDNAs of more than 500 bp. An example of the
output of gel electrophoresis is depicted in Figure 1.14. In DNA
computing, this technique is used to visualize the results of
computation. Normally, at the end of this process, the gel is
photographed for convenience.
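From a computational point of view, gel electrophoresis acts as a
sort by length. The sketch below is a deliberately crude model:
it orders molecules by length only, uses the 500 bp threshold
quoted above to pick the gel, and ignores migration physics. The
function names are chosen here for illustration.

```python
def choose_gel(length_bp):
    """Pick a gel type using the length ranges quoted in the text."""
    return "polyacrylamide" if length_bp <= 500 else "agarose"

def run_gel(lengths_bp):
    """Order molecules from the one that travels furthest (shortest)
    to the one that stays closest to the start (longest)."""
    return sorted(lengths_bp)

print(run_gel([140, 60, 100]))  # [60, 100, 140]
print(choose_gel(140))          # polyacrylamide
print(choose_gel(2000))         # agarose
```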
Figure 1.13: Polyacrylamide gel electrophoresis.

Figure 1.14: Example of a gel image.
1.3.8 DNA EXTRACTION
By exploiting the specificity of hybridization, ssDNAs can also be
segregated by sequence. Given a DNA mixture T, the objective of
this operation is to remove the subset Ts of strands in T that
contain the subsequence S. Figure 1.15 shows an example of DNA
extraction in which S = AGCATA. Before the extraction, a
biotinylated strand F carrying S*, where * denotes Watson-Crick
complementation, is conjugated to streptavidin-coated magnetic
beads. Next, the strands F are mixed with the mixture T so that
they hybridize to the strands in T that contain S. After the
hybridization, the strands F can be removed magnetically from the
DNA mixture T. At the same time, the subset of T that is hybridized
with S* is also removed from the mixture. Lastly, the strands Ts
can be recovered by melting them off, or washing them off, the
strands F.
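The extraction can be sketched as a filter on a list of strands.
The probe carries S*, the complement of S = AGCATA from the
example above. The mixture contents and the function name extract
are made up for illustration, and the model assumes perfect
hybridization with no mismatched capture.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement of a 5'->3' strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(strand))

def extract(mixture, probe):
    """Split the mixture into the strands captured by the probe and
    the strands left behind in the tube."""
    s = reverse_complement(probe)  # subsequence the probe pairs with
    captured = [t for t in mixture if s in t]
    remaining = [t for t in mixture if s not in t]
    return captured, remaining

probe = reverse_complement("AGCATA")  # S*, carried by strand F
mixture = ["TTAGCATACC", "GGGTTTAAAC", "AGCATAGGGG"]
captured, remaining = extract(mixture, probe)
print(probe)      # TATGCT
print(captured)   # ['TTAGCATACC', 'AGCATAGGGG']
print(remaining)  # ['GGGTTTAAAC']
```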

Figure 1.15: An example of DNA extraction by using
streptavidin-coated magnetic beads.
1.4 THE BEGINNING OF DNA COMPUTING

1.4.1 DNA COMPUTING FOR HAMILTONIAN PATH PROBLEM
A non-deterministic algorithm for solving the directed HPP is as
follows:
Step 1: Generate random paths through the graph in large quantity.
Step 2: Reject all paths that do not begin with the start vertex
vin and end in the end vertex vout.
Step 3: Reject all paths that do not involve exactly n vertices.
Step 4: For each of the n vertices v, reject all paths that do not
involve v.
Step 5: The answer is ‘YES’ if any path remains, otherwise ‘NO’.
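The five steps can be simulated in silico as a generate-and-filter
procedure. The sketch below is an illustration only. The edge
list is an assumption: it contains just the edges whose oligos
are named in Figure 1.19, which is not necessarily the complete
graph of Figure 1.16. The random walk, the sample size, and the
function names are also chosen here. Because Step 1 is random, a
‘NO’ answer from a finite sample is not a proof that no path
exists.

```python
import random

# Assumed edge list, taken from the edge oligos named in Figure 1.19.
EDGES = {(0, 1), (0, 3), (1, 2), (1, 5), (2, 3),
         (3, 4), (4, 1), (4, 5), (5, 6)}

def random_path(edges, n, rng):
    """Random walk from a random vertex; stops at a dead end or
    after 2n vertices, so paths of many lengths are produced."""
    path = [rng.randrange(n)]
    while len(path) < 2 * n:
        successors = sorted(j for (i, j) in edges if i == path[-1])
        if not successors:
            break
        path.append(rng.choice(successors))
    return path

def hamiltonian_path_exists(edges, n, v_in, v_out,
                            samples=5000, seed=0):
    rng = random.Random(seed)
    # Step 1: generate random paths in large quantity.
    paths = [random_path(edges, n, rng) for _ in range(samples)]
    # Step 2: keep paths that begin with v_in and end in v_out.
    paths = [p for p in paths if p[0] == v_in and p[-1] == v_out]
    # Step 3: keep paths with exactly n vertices.
    paths = [p for p in paths if len(p) == n]
    # Step 4: for each vertex, keep paths that involve it.
    for v in range(n):
        paths = [p for p in paths if v in p]
    # Step 5: 'YES' if any path remains.
    return len(paths) > 0

print(hamiltonian_path_exists(EDGES, 7, 0, 6))
print(hamiltonian_path_exists(EDGES - {(5, 6)}, 7, 0, 6))  # False
```

With the assumed edges, a single random walk produces the path
through vertices 0 to 6 in order with probability 1/56, so a few
thousand samples are almost certain to contain it. Removing the
edge into vertex 6 makes every sample fail Step 2.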
Adleman showed that this algorithm can be implemented at the
molecular level. First, each vertex is represented by a single
stranded sequence of 20 nucleotides. A length of 20 is enough to
ensure that the codes are “sufficiently different”. The set of
20-mer oligonucleotides, or oligos for short, used by Adleman to
encode the vertices is randomly designed in advance. The oligo
encoding vi is denoted as Oi, while the Watson-Crick complement of
Oi is denoted as Ōi. Each edge eij, which connects vertex vi to
vertex vj, is encoded by a 20-mer DNA sequence made of two parts:
the first part consists of the 3’ 10-mer of Oi (unless vi is v0,
in which case the whole 20-mer is used), whereas the second part
consists of the 5’ 10-mer of Oj (unless vj is v6, in which case
the whole 20-mer is used). The oligo encoding eij is
denoted as Oij. Based on the example problem in Figure 1.16, this
method of encoding is depicted in Figure 1.17. Note that all the
DNA sequences are written from 5’ to 3’.
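The encoding rule can be sketched as follows. The vertex oligos
here are random strings generated only for illustration and are
not Adleman's actual sequences; the function names are also
hypothetical. The sketch shows that the edge oligos along the
path from v0 to v6 through all seven vertices concatenate to a
strand of 7 x 20 = 140 bases.

```python
import random

def random_oligo(length, rng):
    """Random sequence standing in for a designed vertex oligo."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def encode_edge(i, j, vertex, first, last):
    """Build Oij: the 3' 10-mer of Oi (whole Oi if i is the first
    vertex) followed by the 5' 10-mer of Oj (whole Oj if j is the
    last vertex)."""
    left = vertex[i] if i == first else vertex[i][10:]
    right = vertex[j] if j == last else vertex[j][:10]
    return left + right

rng = random.Random(1)
vertex = {v: random_oligo(20, rng) for v in range(7)}

path = [0, 1, 2, 3, 4, 5, 6]
edge_oligos = [encode_edge(i, j, vertex, 0, 6)
               for i, j in zip(path, path[1:])]
strand = "".join(edge_oligos)

print([len(o) for o in edge_oligos])  # [30, 20, 20, 20, 20, 30]
print(len(strand))                    # 140
```

The joined strand is exactly the vertex oligos O0 to O6 written
one after another, which is why the complementary vertex oligos
can act as splints between consecutive edge oligos.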
To implement Step 1 of the algorithm, all the oligos
representing the edges, Oij, and the complementary oligos of the
vertices, Ōi, are poured into a single test tube. Hybridization
and ligation reactions are then applied to the mixture, which
results in the formation of DNA molecules encoding a large number
of random paths through the graph, based on Watson-Crick
complementarity. Figure 1.18 shows an example of how these
molecules form during hybridization. Figure 1.19, on the other
hand, shows several possible formations after the ligation
reaction is complete.

Figure 1.16: (a) A directed graph for Hamiltonian path problem (b) The
answer of Hamiltonian path problem.
Figure 1.17: Encoding method based on Adleman DNA computing,
illustrated with the vertex oligos O2, O3, and O4 and the edge
oligos O23 and O34.

Figure 1.18: Hybridization and ligation in Adleman DNA computing.

Figure 1.19: Examples showing several formations of candidate paths:
v0->v1->v5->v6 (O01 O15 O56), v3->v4->v1->v2->v3 (O34 O41 O12 O23),
v0->v3->v4->v5->v6 (O03 O34 O45 O56), and
v0->v1->v2->v3->v4->v5->v6 (O01 O12 O23 O34 O45 O56).
Step 2 is implemented by subjecting the product of Step 1 to
polymerase chain reaction (PCR). PCR is done by using two
primers, which are O0 and Ō6. As a result, all the formations that
begin with v0 and end with v6 are exponentially amplified. For
Step 3, the product of Step 2 is separated by length using gel
electrophoresis. According to the gel image, the double stranded
DNAs (dsDNAs) of 140 base pairs (bp), which represent paths that
begin at v0, end at v6, and contain exactly seven vertices, are
excised and extracted from the gel, as shown in Figure 1.20. The
extracted DNA molecules are again amplified by PCR.

Step 4 can be implemented by affinity purification of the product
of Step 3 with a biotin-avidin magnetic bead system, repeated n
times, where each time the DNA molecules that contain the
subsequence Oi are selected and separated from the solution.
Figure 1.21 shows a magnetic bead separation for selecting and
separating the DNA molecules containing the subsequence O2. The
product of the magnetic bead separation is DNA molecules
representing a path that enters every vertex at least once.
Lastly, Step 5 can be carried out with a 260 nm ultraviolet (UV)
source to check whether any DNA molecules have survived in the
test tube after Step 1 to Step 4. The answer to the HPP is ‘YES’
if any DNA molecules remain, otherwise ‘NO’.
Figure 1.20: Selection by gel electrophoresis.

Figure 1.21: Extraction by magnetic bead separation.
1.5 SUMMARY
This chapter has covered the basic structure of DNA, including
nucleotides, nucleobases, the single stranded structure, the
double stranded structure, and Watson-Crick base pairing. Basic
biotechnology has been outlined as well. This chapter has also
presented the beginning of DNA computing, where a 7-vertex
instance of the HPP is taken as an example problem.

1.6 REFERENCES
[1] L. Stryer, “Biochemistry”, W.H. Freeman and Company, 4th
Edition, 1995.
[2] J.D. Watson and F.H.C. Crick, “A structure for
deoxyribose nucleic acid”, Nature, 1953.
[3] K. van Holde, “Principles of Biophysical Chemistry”,
Prentice-Hall, 1998.
[4] http://members.aol.com/Cappuccinno21/Biochem/DNAPics/helix.g.
[5] F. Ausubel and K. Struhl, “Short protocol in molecular
biology: A compendium of methods from current protocols
in molecular biology”, Wiley & Sons, 3rd Edition, 1995.
[6] http://web.siumed.edu/~bbartholomew/images/chapter5/F05-14.jpg.
[7] C.S. Calude and G. Paun, “Computing with cells and atoms
- An introduction to quantum, DNA, and membrane
computing”, Taylor & Francis Inc., New York, 2001.
[8] M. Zucca, “DNA based computational models”, PhD
Thesis, Politecnico Di Torino, Italy, 2000.
[9] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and
P. Walter, “Molecular biology of the cell”, Garland
Science, 4th Edition, 2002.
[10] J.P. Fitch, “An engineering introduction to biotechnology”,
SPIE, 2002.
[11] G. Paun, G. Rozenberg, and A. Salomaa, “DNA
computing: new computing paradigms”, Springer, New
York, 1998.
[12] M. Amos, “DNA computation”, PhD Thesis, The
University of Warwick, UK, 1997.