
B. TECH SEMINAR REPORT


on DNA COMPUTING
Submitted to
GOVERNMENT COLLEGE OF ENGINEERING, JALGAON
425002
(An Autonomous Institute of Government of Maharashtra and affiliated To Kavayitri
Bahinabai Chaudhari North Maharashtra University, Jalgaon)

Submitted By
ABHISHEK YOGIRAJ SHIVGAN
(1811001)

Course Guide
Shri. S. C. Kulkarni

DEPARTMENT OF
ELECTRONICS AND TELECOMMUNICATION ENGINEERING
GOVERNMENT COLLEGE OF ENGINEERING, JALGAON 425002
DEC 2021
GOVERNMENT COLLEGE OF ENGINEERING, JALGAON 425002
(An Autonomous Institute of Government of Maharashtra and affiliated To Kavayitri
Bahinabai Chaudhari North Maharashtra University, Jalgaon)
Department of Electronics and Telecommunication Engineering

CERTIFICATE

This is to certify that the ET407U SEMINAR REPORT on DNA COMPUTING, which is being
submitted herewith in partial fulfilment of the degree of Bachelor of Technology, was completed by
ABHISHEK YOGIRAJ SHIVGAN under my supervision and guidance. In line with the declaration of
the student, the work embodied in this Seminar Report is, to the best of my knowledge and belief,
the student's own contribution.

(Shri S. C. Kulkarni) (Dr. D. S. Chaudhari)


Guide Head of E & Tc Department

(Dr. G. M. Malwatkar)
Principal
DECLARATION

I hereby declare that the ET407U SEMINAR REPORT on DNA COMPUTING was prepared and
written by me under the guidance of Shri. S. C. Kulkarni at Government College of Engineering,
Jalgaon. This work has not previously formed the basis for the award of any degree, diploma
or certificate, nor has it been submitted elsewhere for the award of any degree or diploma.

Place: Jalgaon
Date:
ABHISHEK YOGIRAJ SHIVGAN
PRN: 1811001
Final Year B. Tech E&Tc
ACKNOWLEDGMENT

It is indeed a great pleasure and proud privilege for me to present this report. First and
foremost, I am thankful to the principal of our college, Dr. G. M. Malwatkar, for taking an
interest in all activities related to studies. I would like to express sincere gratitude towards our
HoD, Dr. D. S. Chaudhari, and my guide, Shri. S. C. Kulkarni Sir, for being the guiding force behind
all my efforts and for their assistance during the seminar. I also express my sincere thanks to the
staff of the Electronics and Telecommunication Engineering Department for their co-operation and
suggestions during this seminar and report preparation.

Abhishek Yogiraj Shivgan


PRN: 1811001
Final Year B. Tech (E&Tc)
ABSTRACT

DNA computing is an area of natural computing based on the idea that molecular
biology processes can be used to perform arithmetic and logic operations on
information encoded as DNA strands. The first part of this review outlines basic
molecular biology notions necessary for understanding DNA computing, recounts the
first experimental demonstration of DNA computing (Adleman’s 7-vertex
Hamiltonian Path Problem), and recounts the milestone wet laboratory experiment
that first demonstrated the potential of DNA computing to outperform the
computational ability of an unaided human (20 variable instance of 3-SAT).

The second part of the review describes how the properties of DNA-based
information, and in particular the Watson–Crick complementarity of DNA single
strands, have influenced areas of theoretical computer science such as formal
language theory, coding theory, automata theory and combinatorics on words. More
precisely, we describe the problem of DNA encodings design, present an analysis of
intramolecular bonds, define and characterize languages that avoid certain
undesirable intermolecular bonds, and investigate languages whose words avoid even
imperfect bindings between their constituent strands. We also present another,
vectorial, representation of DNA strands, and two computational models based on
this representation: sticker systems and Watson–Crick automata. Lastly, we describe
the influence that properties of DNA-based information have had on research in
combinatorics on words, by enumerating several natural generalizations of classical
concepts of combinatorics on words: pseudopalindromes, pseudoperiodicity,
Watson–Crick conjugate and commutative words, involutively bordered words, and
pseudoknot-bordered words. In addition, we outline natural extensions in this context of
two of the most fundamental results in combinatorics on words, namely Fine and Wilf's
theorem and the Lyndon–Schützenberger result.
Chapter 1
INTRODUCTION

Every cell of a living organism carries the information needed for the functions that
keep the cell alive. This genetic information is stored in molecules called nucleic
acids, the most stable form of which is deoxyribonucleic acid (DNA). Each DNA strand
is a long polymer of millions of linked nucleotides, and two strands wind together
into a helical structure. Each nucleotide consists of one of four nitrogen bases, a
five-carbon sugar, and a phosphate group. The nitrogen bases, A (Adenine), T (Thymine),
G (Guanine) and C (Cytosine), encode the genetic information, while the sugar and
phosphate provide structural stability. The two strands are linked to each other by
the base-pairing rule: T pairs with A and C pairs with G. The order of these bases is
important because it determines the function of different genes.
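
The base-pairing rule can be illustrated with a short Python sketch. The function
names complement and reverse_complement are illustrative choices made for this
report and are not taken from any particular library.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRING[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read in the opposite direction."""
    return complement(strand)[::-1]

print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Taking the complement twice returns the original strand, which is the property that
lets one strand act as a template for the other.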

What is DNA Computing

DNA computing is an area of natural computing based on performing logical and
arithmetic operations using the molecular properties of DNA, replacing traditional
silicon chips with biochips. This allows massively parallel computation, in which
complex mathematical problems can be solved in much less time. With a sufficient
amount of self-replicating DNA, such computation can be far more efficient than on a
traditional computer, which would require much more hardware. A good grounding in
both biology and computer science is required to build algorithms for DNA computing.
Instead of being stored as binary digits, the information or data is stored in the
form of the bases A, T, G and C. The ability to synthesize short sequences of DNA
artificially makes it possible to use these sequences as inputs for algorithms.
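
As a minimal sketch of storing data in bases rather than binary digits, the Python
code below maps every pair of bits to one base. The particular mapping (00 to A, 01
to C, 10 to G, 11 to T) and the function names are assumptions made for
illustration; practical DNA storage schemes use more elaborate encodings.

```python
# Assumed mapping of 2-bit values to bases (an illustrative convention only).
BASES = "ACGT"

def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)

encoded = bytes_to_dna(b"Hi")
print(encoded)                # CAGACGGC
print(dna_to_bytes(encoded))  # b'Hi'
```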
Chapter 3
LITERATURE SURVEY
Overview:
The concept of DNA computing was introduced by USC professor Leonard Adleman in the
November 1994 Science article “Molecular Computation of Solutions to Combinatorial
Problems”. Adleman showed that DNA could be used to store data and even perform
computations in a massively parallel fashion. Using the four bases of DNA (adenine,
thymine, cytosine, and guanine), Adleman encoded a classic “hard” problem (one whose
search space grows exponentially with each additional input), the Hamiltonian Path
Problem, a close relative of the Traveling Salesman Problem, into strands of DNA and
used the biological properties of DNA to find the answer. Adleman came upon the
idea of DNA computing when he noticed how remarkably similar DNA replication was to
an early theoretical computer developed in the 1930s by Alan Turing. During
replication, DNA polymerase slides along a single DNA strand, reading each base and
writing its complement on the new strand, while in one version of the Turing Machine,
a mechanism moves along a pair of tapes, reading instructions from an “input tape” and
writing out the result on the “output tape”. Interestingly, Alan Turing’s simple
machine was proven to have the same computing capability as any modern computer.
Adleman began to wonder: if Turing’s simple machine has such great computational
ability, would similarly operating DNA also be able to do computations? It could, as
Adleman’s first experiment showed. Although his experiment involved large amounts of
slow, manual labor to separate out the correct answers, carried a high chance of error,
and did not scale to larger problems, DNA computing promised immensely high-density
storage, unparalleled energy efficiency, and a level of parallelism unknown to digital
computers. A new field was born.
Chapter 4
DNA COMPUTING
What is DNA Computing?
DNA computing is a modern area of science that treats biomolecules as the
fundamental elements of computing devices: it computes with biological molecules
rather than conventional silicon chips. It is related to several other areas,
including chemistry, software engineering, cell genetics, physics, and mathematics.
While its conceptual history stretches back to the early 1950s, the principle of
computing with molecules was only demonstrated experimentally in 1994, when Leonard
Adleman solved a small instance of a very well-known combinatorial problem using
standard molecular biology methods in the lab. Since this study, interest in DNA
computing has increased significantly, and it is now a well-established research field.

Figure 1. DNA Computing

How will the computer with DNA work?


Scientists have found a new material for building the next generation of computer
chips. Hundreds of thousands of naturally powerful computers reside within living
things, your own body included. DNA (deoxyribonucleic acid) molecules, the material
from which our genes are made, can hold many times more data than the most efficient
human-built computing devices … The reason for this enthusiasm is that DNA molecules
are inexpensive, fairly simple to manufacture, and versatile. In principle there is
no limit to the capacity of DNA computation, since its power increases as more
molecules are added to the reaction. Unlike silicon transistors, which carry out a
single logical operation at a time, DNA structures can potentially carry out as many
operations at once as are needed to solve a problem … Scientists have long been aware
that DNA could be used to store data. Compared to standard devices, DNA computers
take a radically different approach to solving problems. “Modern electronic
processors have to choose a route to follow when they come to a T-junction, while a
DNA computer doesn’t have to decide because it replicates itself and follows both
directions at the same time.” Complex mathematical problems have already been solved
with DNA molecules. Though still in development, DNA computers may be able to hold
billions of times more data than a personal computer. Researchers are using genetic
material to create nano-computers that could take the place of silicon-based machines
over the next decade.

Figure 2. The work of DNA Computing

The experiments and Success of DNA Computers.


Scientists have created a new form of DNA device that operates in living organisms,
potentially opening the way for systems that can pick out infected cells from healthy
ones. The device relies on a mechanism called RNA interference (RNAi), in which small
RNA molecules inhibit the production of protein from a gene. The success of Adleman’s
DNA computer is evidence that DNA can be used to solve complex mathematical problems.
Nevertheless, this initial DNA machine is very far from challenging silicon-based
computers in terms of efficiency: it produced a collection of potential answers very
quickly, but it took Adleman days to narrow down the options. Computer scientists at
the University of California, Davis, and Caltech have designed DNA molecules that
self-assemble into structures, using six-bit inputs to effectively run their own
program. Microsoft also has a programming language for DNA computing, which will help
make DNA computing practical once bio-processor technology progresses to the stage
where it can run more sophisticated algorithms. In addition, Microsoft had planned to
incorporate DNA computing into its cloud services by 2020 and to build DNA storage
into its cloud platform. The objective of the field of DNA computing is to establish
a system that can operate independently of human intervention. It will take several
years for DNA computer components such as logic gates and biochips to evolve into a
functional, practical DNA computer. Scientists believe that if such a computer is ever
built, it would be lighter, more reliable and more powerful than today’s computers.
Advantages of DNA-based Computers.
Adleman’s procedure starts by assigning DNA strands to the cities on a map and to the
links between cities. The city strands bind to the link strands and form longer
strands that represent paths through the various cities. The strands are then sorted
so that only those joining the proper number of cities remain. There is “still the
risk that a few of the strands would include the same city twice,” so the DNA is
passed through a series of filters; each filter retains only DNA containing a certain
segment (each segment representing a city). The DNA strands that survive the filters
represent all valid routes through the cities.
One advantage of this method is that it operates in “parallel,” analysing all
potential scenarios concurrently and carrying out hundreds of thousands of operations
at the same time. It thus enables massively parallel searches and produces the full
set of possible solutions. This parallel computational power may allow DNA computers
to find tractable approaches to otherwise difficult problems, and possibly to speed up
large, but otherwise solvable, polynomial-time problems that require relatively few
operations. DNA can hold more information than a trillion CDs in a cubic centimetre,
which gives it room for vast quantities of working memory. A DNA machine also has
very low power consumption: mounted within a cell, it would need little energy to
work, and its energy efficiency is more than a thousand times that of a PC. While
still in their infancy, DNA machines are able to store billions of times more data
than a personal computer.
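
The generate-and-filter idea described above can be imitated in software. The Python
sketch below is only an analogy under stated assumptions: the four-city map and the
function name hamiltonian_paths are hypothetical, and the program examines candidate
routes one after another, whereas the DNA strands in a test tube form and are
filtered in parallel.

```python
from itertools import product

def hamiltonian_paths(cities, roads, start, end):
    """Generate-and-filter search in the spirit of the DNA procedure."""
    n = len(cities)
    road_set = set(roads)
    # Step 1: form every sequence of n cities (the pool of candidate strands).
    candidates = product(cities, repeat=n)
    # Step 2: keep only sequences whose consecutive cities are joined by a road.
    linked = (p for p in candidates
              if all((a, b) in road_set for a, b in zip(p, p[1:])))
    # Step 3: keep only sequences with the required start and end city.
    anchored = (p for p in linked if p[0] == start and p[-1] == end)
    # Step 4: filter out sequences that visit any city more than once.
    return [p for p in anchored if len(set(p)) == n]

# Hypothetical example map: one-way roads between four cities.
cities = ["A", "B", "C", "D"]
roads = [("A", "B"), ("B", "C"), ("C", "D"),
         ("A", "C"), ("B", "D"), ("C", "B")]

print(hamiltonian_paths(cities, roads, "A", "D"))
# [('A', 'B', 'C', 'D'), ('A', 'C', 'B', 'D')]
```

The number of candidate sequences grows as n to the power n, which is why the
sequential version quickly becomes impractical and why the parallelism of DNA is
attractive for this kind of search.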

The future of DNA computing.


DNA computing is fascinating because it brings together chemists, biologists,
mathematicians and software engineers to understand and model the essential
biological processes and algorithms that occur inside cells. Conventional machines
perform calculations sequentially (that is, they take on tasks one at a time).
Parallel DNA computation, by contrast, could tackle mathematical problems in hours
that would take electronic computers many years. The DNA computer is in the very
early phases of development; however, several areas are already in active use or
under active development. Classical DNA computing techniques have already been
proposed for real-life problems such as breaking the Data Encryption Standard (DES).
While this task has already been accomplished using traditional techniques in a far
shorter time than that proposed for DNA methods, the DNA models are claimed to be far
more versatile, efficient, and cost-effective. A machine composed of DNA and enzymes
has also been developed by Israeli researchers.
Chapter 5

DNA computers vs. Silicon-based computers

DNA is often thought of as the "software" of life.

When talking about deoxyribonucleic acid -- DNA, the molecule that carries the
genetic information of life -- scientists often make comparisons to computer systems,
with DNA being an enormous "program" to be run by the body's hardware. But
significant differences exist between the genetic code of DNA and the binary code
used by computers, and each system has its advantages and limitations.

Counting Digits
The simplest unit of binary code is the binary digit, or "bit," which can have one of
two values: 0 or 1. The simplest unit of DNA, on the other hand, is the nucleotide,
which can have one of four bases: adenine, cytosine, thymine or guanine (A, C, T or
G). This increased variation means that each nucleotide of DNA can hold twice as
much information as each digit of a binary program.

Byte Sizes
Computers and biological systems both read their respective codes in blocks of
several units instead of analysing each bit or nucleotide individually. Binary
information is grouped into sets of eight bits, called bytes; each byte thus has one of
256 possible configurations of zeros and ones. Genetic information instead comes in
triplets of nucleotides known as codons, which represent different amino acids,
meaning that each DNA "byte" has only 64 possibilities.
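
The counts in the two comparisons above follow directly from the number of symbols
in each alphabet. The short Python check below is a sketch for illustration; the
variable names are chosen for this report.

```python
import math

bits_per_bit = math.log2(2)         # one binary digit has 2 possible values
bits_per_nucleotide = math.log2(4)  # one nucleotide has 4 possible bases
values_per_byte = 2 ** 8            # a byte is 8 bits
values_per_codon = 4 ** 3           # a codon is 3 nucleotides

print(bits_per_nucleotide / bits_per_bit)  # 2.0
print(values_per_byte, values_per_codon)   # 256 64
print(math.log2(values_per_codon))         # 6.0
```

A codon therefore carries 6 bits, against 8 bits for a byte.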

Starting and Stopping


Both binary and genetic codes contain signals that indicate where to begin and end
the reading of their messages. Computers use start and stop bits for this purpose,
while the genetic code contains one start codon and three stop codons. However,
DNA often exhibits greater flexibility in starting and stopping, as certain parts of the
genetic code can be read in different, overlapping segments. These different
interpretations are called open reading frames, and often each frame codes for an
entirely different but still useful final product.

Protecting Data
In digital code, a single inaccurate bit causes its byte to have a different value, which
can introduce significant errors to a computer program. DNA is considerably more
resilient in comparison, as many nucleotide changes do not result in changes to the
value of -- the amino acid coded by -- a codon. Although 64 codons are possible,
biological machinery uses only 20 amino acids in the construction of proteins. Many
codons that differ by one nucleotide therefore code for the same amino acid, a
property known as redundancy. Redundancy protects genetic data from some
inevitable errors that occur in the replication and reading of DNA.
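
Redundancy can be made concrete by building the standard genetic code table and
counting codons per amino acid. In the Python sketch below, the table is written in
the usual compact form (bases in the order T, C, A, G, one-letter amino acid codes,
"*" for stop); the helper name synonymous_neighbours is a hypothetical name used
only for this illustration.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def synonymous_neighbours(codon: str) -> int:
    """Count single-base changes that leave the encoded amino acid unchanged."""
    count = 0
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                mutated = codon[:position] + base + codon[position + 1:]
                if CODON_TABLE[mutated] == CODON_TABLE[codon]:
                    count += 1
    return count

counts = Counter(CODON_TABLE.values())
print(len(CODON_TABLE))              # 64
print(len(counts) - 1)               # 20 amino acids, excluding stop
print(counts["*"])                   # 3
print(counts["L"])                   # 6 codons encode leucine
print(synonymous_neighbours("GGT"))  # 3 of the 9 possible single-base changes
```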

How Much Information Does DNA Encode?


The simplest answer to “How much information does DNA encode?” is “enough data
to completely specify an organism’s particular genome and epigenome.” That
involves the number of base pairs and the number of possible sites for adding a
suppressor. Human DNA has approximately 3 billion base pairs, according to the
National Human Genome Research Institute. That means 4^3,000,000,000 possible
base sequences.
For simplicity, let’s say that each gene is either suppressed, or not, in the epigenome.
That would be a binary choice for each gene. Most humans have between 20,000 and
25,000 genes. Taking 22,500 as the average gives about 2^22,500 more choices.
The length of DNA varies for different species. Humans, with about 3 billion base
pairs, have neither the largest nor the smallest genome.
Normally we specify the “amount of information” in bits, so 2^n choices require n
bits. Note that 4^j = (2^2)^j = 2^(2j). Therefore, the human genome encodes
4^(3 billion) = 2^(6 billion) choices, or 6 billion bits of information. The
epigenome encodes at least 2^22,500 choices, or 22,500 bits. The total information is
6,000,022,500 bits, or approximately 6 Gb (gigabits).
We usually discuss computer storage in bytes rather than bits. With 8-bit bytes, 6 Gb
amounts to 6/8 = 0.75 GB (gigabytes), or 750 MB (megabytes). Counted in 7-bit ASCII
characters instead, it is 6/7 = 0.857 billion characters, or about 857 MB.
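
The arithmetic of this section can be checked with a few lines of Python. This is a
sketch using the round figures quoted in the text (3 billion base pairs and an
assumed average of 22,500 genes); the last two lines show the total both in standard
8-bit bytes and in 7-bit ASCII units.

```python
BASE_PAIRS = 3_000_000_000  # approximate human genome size, from the text
GENES = 22_500              # assumed average gene count, from the text

genome_bits = 2 * BASE_PAIRS  # 4^n = 2^(2n): two bits per base pair
epigenome_bits = GENES        # one expressed/suppressed bit per gene
total_bits = genome_bits + epigenome_bits

print(total_bits)                       # 6000022500
print(round(total_bits / 8 / 1e6, 1))   # 750.0 (MB, 8-bit bytes)
print(round(total_bits / 7 / 1e6, 1))   # 857.1 (millions of 7-bit units)
```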
How much information do the amino acids encode?

One might suggest that the genetic information is equally carried by the amino acids
produced by the codons. (This still assumes that “junk” DNA also carries exactly that
information). There are 21 possible results from each codon. The one “start” codon
encodes one amino acid; 60 different codons encode another 19 amino acids; and
three codons encode “stop”. The 3 billion base pairs would be grouped into 1 billion
codons, and each codon has 21 possible meanings. So that would be 21^(1 billion)
sequences of amino acids.
We need to convert 21^(1 billion) to a power of two, since all the other information
results are in bits. The conversion factor is ln(21)/ln(2), where “ln” is the natural
logarithm function. We have ln(21)/ln(2) = 3.0445/0.6931 = 4.3923 (rounded).
(1 billion) * 4.3923 = 4,392,300,000 bits of information to code amino acids. So that
is a total information of 4,392,322,500 bits including the epigenome. With 8-bit
bytes, that would be about 549 MB (megabytes); counted in 7-bit ASCII characters, it
is 627,474,642 characters, or about 627 MB.
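
The same conversion can be reproduced in Python. This sketch uses the unrounded
logarithm, so its bit count is slightly higher than the figure obtained above from
the rounded factor 4.3923; the constant names are chosen for this illustration.

```python
import math

CODONS = 1_000_000_000   # 3 billion base pairs grouped in threes
MEANINGS = 21            # 20 amino acids plus "stop"
EPIGENOME_BITS = 22_500  # one bit per gene, as assumed earlier

bits_per_codon = math.log2(MEANINGS)  # same as ln(21)/ln(2)
total_bits = CODONS * bits_per_codon + EPIGENOME_BITS

print(round(bits_per_codon, 4))     # 4.3923
print(round(total_bits / 8 / 1e6))  # 549 (MB, 8-bit bytes)
print(round(total_bits / 7 / 1e6))  # 627 (millions of 7-bit units)
```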
Comparing the Genetic Code to Computer Data Storage
Let’s conclude by comparing computer data storage to the genetic code for DNA.
Computers store data in two-valued bits, grouped as bytes of 8 bits (7 bits are
enough for one ASCII character). One byte holds 2^8=256 unique values; a 7-bit ASCII
character holds 2^7=128.
DNA stores data in four-valued base pairs, which are then read, via RNA, as codons of
3 bases. One codon holds 4^3=2^6=64 unique values. A sequence of base pairs that
conveys biological information is called a gene. DNA includes extra information to
express or suppress specific genes. Each gene has at least one bit of information for
expression or suppression.
Computer files may be measured in megabytes or gigabytes: millions or billions of
bytes. One CD-ROM disc may store about 710 MB. Modern solid-state memory and
disk drives can store many gigabytes. If we can fully specify one human’s DNA by
giving the full sequence of base pairs, plus a binary flag to express or suppress
each gene, then human DNA contains about 6 Gb of information: about 750 MB in 8-bit
bytes, or 857 MB in 7-bit ASCII characters.
Conclusion

• DNA computers show enormous potential, especially for medical purposes as
well as data processing applications.

• Many issues remain to be overcome before a useful DNA computer can be produced.

• A lot of work and resources are still required to develop it into a fully fledged
product.

• Miniaturization of data storage.

• Massive amount of working memory.
