Journal of Computer Applications (JCA) ISSN: 0974-1925, Volume VI, Issue 3, 2013

DNA Encryption Mechanism for Data Security Over Transmission
S. Deepika a,*

Abstract - A computer, as the name suggests, computes complex mathematical operations and performs the desired operations. DNA computing is a new method of computation that simulates bio-molecular DNA structures and computes by means of molecular techniques. It is a form of liquid computation that harnesses the enormous parallel computing ability and high memory density of bio-molecules, which brings potential challenges and opportunities to traditional cryptography. DNA cryptography is a recent field of research in which the data to be transmitted is encoded more securely by applying a DNA cryptographic algorithm [1] over the underlying International Data Encryption Algorithm (IDEA) [9]. The cipher (unreadable) text produced is in the form of DNA sequences, which makes the data more secure during transmission. This enables effective exchange of confidential data for military purposes, in cloud computing and in many other applications, though it requires high-tech laboratory facilities for implementation.

Index Terms – CIPHER, DNA CRYPTOGRAPHY, ENCRYPTION, IDEA.

I. INTRODUCTION
The word computer conjures up images of keyboards and monitors, and terms like RAM, ROM, gigabyte and megahertz come to mind. But what if computers were ubiquitous and could take many forms other than electronic components on a silicon substrate? Could a liquid computer exist in which interacting molecules perform the computations? The answer is yes. This is the story of the DNA computer, which is still no more complex than the superlative computer, the human brain. Leonard Adleman, the father of DNA computing, made a pioneering experiment in the field of molecular biology that opened the possibility that moderately large instances of NP-complete problems can be solved using DNA.

A. What is DNA?
Before delving into the principles of DNA computing, we must have a basic understanding of what DNA actually is. All organisms on this planet share the same type of genetic blueprint, which binds us together. The way in which that blueprint is coded is the deciding factor as to whether you will be bald, have a bulbous nose, be male or female, or even whether you will be a human or an oak tree.

Within the cells of any organism is a substance called Deoxyribonucleic Acid (DNA), shown in fig 1.1, which is a double-stranded helix of nucleotides that carries the genetic information of a cell. This information is the code used within cells to form proteins and is the building block upon which life is formed. Strands of DNA are long polymers of millions of linked nucleotides. Each nucleotide in these polymers is named after the nitrogenous base it contains: Adenine (A), Cytosine (C), Guanine (G) and Thymine (T). These nucleotides will only combine in such a way that C always pairs with G and T always pairs with A. The two strands of a DNA molecule are anti-parallel, that is, each strand runs in the opposite direction to the other. The combination of these 4 nucleotides in the estimated million-long polymer strands can result in billions of combinations within a single DNA double helix.
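The pairing rule described in this section (C with G, T with A, on anti-parallel strands) can be illustrated with a short Python sketch. The function name reverse_complement and the sample strands are assumptions chosen for illustration; they do not come from the original work.

```python
# Minimal sketch of Watson-Crick base pairing on anti-parallel strands.
# The names below are illustrative assumptions.

# A pairs with T, and C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own (opposite) direction."""
    strand = strand.upper()
    invalid = set(strand) - set("ACGT")
    if invalid:
        raise ValueError(f"not a DNA base: {sorted(invalid)}")
    # Complement every base, then reverse because the strands are anti-parallel.
    return strand.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATCG"))  # CGAT
```

Applying the function twice returns the original strand, which reflects the fact that each strand fully determines its partner.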
Manuscript received 18/September/2013.

Figure 1.1 Complementary DNA strands

With the advances in DNA research in projects such as the Human Genome Project and a host of others, the mystery of DNA and its construction is slowly being unraveled through mathematical means. Distinct formulae and patterns have emerged that may have implications well beyond those found in the field of genetics. What does all this chemistry and biology have to do with security, you might ask? To answer that question we must first look at how biological

S.Deepika, UG Scholar, Department of Computer Science and
Engineering, Vel Tech High Tech Dr.Rr Dr.Sr Engineering College, Avadi, Tamil Nadu, India. (E-mail :



science can be applied to mathematical computation in a field known as DNA computing.

B. Basics of DNA Computing:
DNA computing [5], or molecular computing, describes the use of the inherent combinatorial properties of DNA for massively parallel computation. The idea is that with an appropriate setup and enough DNA, one can potentially solve huge mathematical problems by parallel search. Basically, this means that every candidate solution to a given problem can be attempted, in parallel and in no particular order, until the right one is found. Utilizing DNA for this type of computation can be much faster than utilizing a conventional computer, for which massive parallelism would require large amounts of hardware, not simply more DNA.

The paper “DNA-Based Cryptography” [1] argues that the high computational ability and extremely compact information storage of DNA computing make DNA-based cryptography built on one-time pads possible. Its authors argue that current practical applications of cryptographic systems based on one-time pads are limited to the confines of conventional electronic media, whereas a small amount of DNA can suffice for a huge one-time pad for use in public key infrastructure.

II. EXISTING SYSTEM
International Data Encryption Algorithm (IDEA):
The International Data Encryption Algorithm (IDEA) [9] is one of the strongest cryptographic algorithms. Although quite strong, it is not as popular as DES and AES because it was patented and lacks an established track record. It is a block cipher that works on 64-bit plaintext blocks with a 128-bit key to encode the data. It is also a reversible algorithm, that is, the same algorithm can be used for both encryption and decryption. The 64-bit plaintext block is partitioned into four 16-bit sub-blocks. Each of the eight encryption rounds uses six 16-bit key sub-blocks generated from the 128-bit key.
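As a rough illustration of how 16-bit key sub-blocks can be drawn from the 128-bit key, the following Python sketch follows the standard IDEA key schedule (split the key into eight sub-blocks, rotate it left by 25 bits, repeat) to produce the 52 sub-blocks. The function name idea_subkeys and the sample key are assumptions made for illustration. The sketch covers only sub-key generation, not the round function.

```python
# Sketch of IDEA encryption sub-key generation (standard key schedule).
# Function and variable names are illustrative assumptions.

KEY_BITS = 128
KEY_MASK = (1 << KEY_BITS) - 1


def idea_subkeys(key: bytes) -> list[int]:
    """Derive the 52 sixteen-bit encryption sub-blocks from a 128-bit key."""
    if len(key) != 16:
        raise ValueError("IDEA requires a 128-bit (16-byte) key")
    k = int.from_bytes(key, "big")
    subkeys: list[int] = []
    while len(subkeys) < 52:
        # Partition the current 128-bit value into eight 16-bit sub-blocks,
        # most significant sub-block first.
        for i in range(8):
            subkeys.append((k >> (112 - 16 * i)) & 0xFFFF)
        # Cyclic left shift of the whole key by 25 positions.
        k = ((k << 25) | (k >> (KEY_BITS - 25))) & KEY_MASK
    # 8 rounds x 6 sub-blocks + 4 for the output transformation = 52.
    return subkeys[:52]


if __name__ == "__main__":
    sample_key = bytes.fromhex("00010002000300040005000600070008")
    keys = idea_subkeys(sample_key)
    print(len(keys), [f"{x:04x}" for x in keys[:10]])
```

The seventh pass of the loop generates eight sub-blocks, of which only four are needed, so the list is truncated to 52 entries.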
Since a further four 16-bit key sub-blocks are required for the subsequent output transformation, a total of 52 (= 8 x 6 + 4) different 16-bit sub-blocks have to be generated from the 128-bit key. First, the 128-bit key is partitioned into eight 16-bit sub-blocks, which are used directly as the first eight key sub-blocks. The 128-bit key is then cyclically shifted to the left by 25 positions, after which the resulting 128-bit block is again partitioned into eight 16-bit sub-blocks to be used directly as the next eight key sub-blocks. This cyclic shift procedure is repeated until all of the required 52 16-bit key sub-blocks have been generated.

The first four 16-bit key sub-blocks are combined with two of the 16-bit plaintext blocks using addition modulo $2^{16}$, and with the other two plaintext blocks using multiplication modulo $2^{16}+1$. At the end of the first encryption round, four 16-bit values are produced, which are used as input to the second encryption round. The process is repeated in each of the subsequent 7 encryption rounds. The four 16-bit values produced at the end of the 8th encryption round are combined with the last four of the 52 key sub-blocks using addition modulo $2^{16}$ and multiplication modulo $2^{16}+1$ to form the resulting four 16-bit cipher text blocks.

The computational process used for decryption of the cipher text is essentially the same as that used for encryption. The only difference is that the 52 16-bit key sub-blocks used for decryption are derived from the encryption sub-blocks by taking their inverses, additive or multiplicative according to the operation in which each is used. In addition, the key sub-blocks must be used in the reverse order during decryption in order to reverse the encryption process. IDEA extends the key for each round using circular left-shift operations and then takes the sub-keys from the newly shifted key.

III. PROPOSED SYSTEM
A. DNA ENCRYPTION:
DNA encryption is a next-generation security mechanism, capable of storing almost a million gigabytes of data inside bacteria. Research from two prominent universities indicates that it is not only possible but also practical to store digital data in the genome of a living organism and retrieve that data hundreds or even thousands of years later, after the organism has reproduced its genetic material through hundreds of generations.

Note: A milliliter of liquid can contain up to 1 billion bacteria, so the potential capacity of bacteria-based memory is enormous. Even very simple bacteria have long strands of DNA with a very large number of bases available for data encryption, and bacteria are by their nature far more resilient to damage than more traditional electronic storage. Bacteria are nature's hardiest survivors, capable of surviving just about any disaster that would finish off a regular hard drive. Besides, the natural reproduction of bacteria would create many redundant copies of the data, which would help preserve the integrity of the information and make retrieval easier.

Preparing traditional data for storage inside bacteria is simple enough. There are four DNA bases that can be used to make up the DNA strings: adenine, cytosine, guanine, and thymine.
That basically means we are working with a base-four number system, also known as quaternary numbers. In a presentation on their breakthrough, the Hong Kong researchers showed how to change the word "iGEM" into DNA-ready code. They used the ASCII table to convert each of the individual letters into a numerical value (i=105, G=71, etc.), which can then be changed from base-10 to base-4 (105=1221, 71=1013, etc.). Finally, those numbers can be changed into their DNA base equivalents, with 0, 1, 2, and 3 replaced with A, T, C, and G. And so “iGEM” becomes TCCTTATGTATTTAGT.

But DNA strands aren't long enough to store complicated information like a photograph or a book, so the best available solution is to fragment the data into lots of little pieces and spread it among the different cells. To make that work, the researchers have to create a system that allows the fragments to be identified and ultimately put back in the right order. So they created a three-part structure for all the DNA: header, message, and checksum. The header is an 8-base-long sequence that is divided into four levels of identifying information (zone, region, area and district), which allows each fragment to be put back in the right order. The message carries the actual usable data, and the checksum that follows repeats the original header, which is useful in controlling for minor mutations in the bacteria.

So, let's say the information has been encrypted and placed in lots of different cells of bacteria. How then does someone retrieve the data on the other end? The decrypter would take the DNA and run it through what's known as next-generation high-throughput sequencing, or NGS [3]. This particular type of sequencing analyzes and compares multiple copies of the same sequence and then uses majority voting to figure out which bases are correct if parts of the data have decayed. Then the compression algorithms could be reversed to restore the raw data to its original form. The last step would be snapping the fragments back together in the correct order so that the DNA strands could be translated back into useful data. This is where we go from just data storage to data encryption. The person trying to read the data needs a formula that will reveal the right order of the headers and checksums. Without that formula, the data remains meaningless.

B. DNA COMPUTING IN CRYPTOGRAPHY:
Here we have integrated DNA computing into the well-known IDEA algorithm to make it more efficient and effective from a cryptanalytic point of view (fig 1.2). That is, it becomes more immune to the general attacks that a cryptographic system encounters in day-to-day scenarios. Many researchers have integrated new mechanisms into IDEA, such as chaotic series, modular arithmetic and VLSI implementations.

C. Steps:
1. Encryption process at sender side:
- Enter the text to be encrypted.
- Apply IDEA to the text entered.
- Encrypt the DNA-encrypted text with IDEA encryption.
- Convert the 64-bit cipher text obtained into a DNA sequence using a look-up table.
- Send the DNA sequence obtained as the cipher.
2. Decryption process at the receiver side:
- The DNA sequence obtained is used to recover the usual cipher for the IDEA decryption algorithm.
- IDEA decrypts the message using the initial cipher obtained from the DNA sequence.
- The message recovered from the DNA decryption algorithm is further decrypted by IDEA.
- The original plaintext is recovered.
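The look-up step that turns cipher bytes into a DNA sequence, and back again at the receiver, can be sketched in Python as follows. The paper does not give its look-up table, so this sketch assumes the digit mapping from the iGEM example (0, 1, 2, 3 replaced with A, T, C, G) applied literally, with four bases per byte. It is therefore not expected to reproduce the DNA cipher shown in Fig 1.2. The function names are hypothetical, and the sample bytes are the IDEA cipher values listed in Fig 1.2.

```python
# Sketch of a byte <-> DNA look-up conversion.
# ASSUMPTION: base-4 digits 0,1,2,3 map to A,T,C,G as in the iGEM example;
# the paper's actual look-up table is not given.

BASES = "ATCG"  # index in this string = base-4 digit


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant digit first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a DNA string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError for anything other than A, T, C, G.
            value = value * 4 + BASES.index(base)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    print(bytes_to_dna(b"iGEM"))  # TCCTTATGTATTTAGT

    cipher = bytes([234, 213, 192, 171, 55, 15, 230, 190])  # values from Fig 1.2
    dna = bytes_to_dna(cipher)
    assert dna_to_bytes(dna) == cipher  # receiver recovers the IDEA cipher
```

Using a fixed four bases per byte keeps the decoding unambiguous, since every group of four bases maps back to exactly one byte.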

Enter text to be encrypted: “Confidential Information”
Cipher (IDEA) = 234 213 192 171 55 15 230 190
Cipher in binary = 11101010 11010101 11000000 10101011 00110111 00001111 11100110 10111110
DNA cipher = GGGCTACCCTTCGCGTTATTACCCC TCGGGTTCCT AAAGCCCGGAGTA
Recovered text (IDEA) = 234 213 192 171 55 15 230 190
Plaintext recovered = “Confidential Information”
Fig 1.2 Encryption and Decryption process

Perhaps this all appears a bit farfetched at first. The “test tube” environment used in this type of cryptography is far from practical for everyday use.

IV. FUTURE SCOPE
DNA computing has enormous application in finding all possible solutions to a particular problem. Thus it can be used in networking to compute the shortest paths with lower costs for transmission of data. It can also be applied to provide security in the firewalls of cloud computing. It also finds a major application in data security for military transactions: if confidential military data is to be sent secretly and securely, DNA encryption can play a vital role. DNA computers are unlikely to feature word processing, e-mailing and solitaire programs. Instead, their computing power will be used by national governments for cracking secret codes, or by airlines wanting to map more efficient routes. Studying DNA computers may also lead us to a better understanding of a more complex computer - the human brain.

V. CONCLUSION
DNA cryptography is the future of information security. Its complexity and randomness provide great uncertainty, which makes encoding data in DNA format better than other mechanisms of cryptography. Integrating it with a well-known symmetric cryptographic mechanism, namely IDEA, makes it very difficult to decode the data without precise knowledge of the key. The field of DNA computing is still in its infancy and the applications for this technology are still not fully understood. The world of information security is always on the lookout for unbreakable encryption to protect the data that we transmit, but it appears that every encryption technology meets its endgame as the computing technology of our world evolves. It appears we are involved in a paradox where the best encryption technology of the day is only as good as the computing power that it is tested upon and the practicality of its application. Is DNA computing viable? Perhaps, but the obstacles that face the field, such as the extrapolation and practical computational environments required, are daunting.

REFERENCES
[1] Gehani, Ashish. LaBean, Thomas H. Reif, John H. “DNA-Based Cryptography”. Department of Computer Science, Duke University. June 1999.
[2] “Sci/Tech DNA hides spy message”. June 10, 1999.
[3] Pelletier, Oliver. “Algorithmic Self-Assembly of DNA Tiles and its Application of Cryptanalysis”. October 2, 2001.
[4] Taggart, Stewart. “Call it the SyDNA Olympics”. March 7, 2000.,1282,34774,00.html
[5] Gupta, Gaurav. Mehra, Nipun. Chakraverty, Shumpa. “DNA Computing”. The Indian Programmer. June 12, 2001.
[6] Peterson, Ivars. “Hiding in DNA”. Science News Online. April 8, 2000.
[7] Blahere, Kristina. “DNA Computing”. CNET. April 26, 2000.
[8] Taylor Clelland, Catherine. Risca, Viviana. Bancroft, Carter. “Hiding Messages in DNA Microdots”. Nature Magazine Vol 399. June 10, 1999. &CC=EP&NR=0482154
[9] Daemen, Joan; Govaerts, Rene; Vandewalle, Joos (1993), "Weak Keys for IDEA", Advances in Cryptology, CRYPTO 93 Proceedings: 224–231.

I am Deepika. S, and I am pursuing my final year (IV year) of the B.E. in Computer Science and Engineering at Vel Tech High Tech Dr.RR Dr.SR Engineering College.