Copyright: Attribution Non-commercial


2D Barcode for DNA Encoding
Elena Purcaru, Cristian Toma
Bucharest General Medicine Faculty, Cybernetics and Economic Informatics Faculty 
“Carol Davila” University of Medicine and Pharmacy, Academy of Economic Studies
Eroii Sanitari Boulevard 8, Bucharest, Romana Square 6, Bucharest, ROMANIA
elena.purcaru@gmail.com, cristian.toma@ie.ase.ro
The paper presents a solution for encoding/decoding DNA information in 2D barcodes. The first part focuses on the existing techniques and symbologies in the 2D barcode field. The 2D barcode PDF417 is presented as the starting point. The adaptations and optimizations on PDF417 and on DataMatrix lead to the solution DNA2DBC (DeoxyriboNucleic Acid Two Dimensional Barcode). The second part shows the DNA2DBC encoding/decoding process step by step. The conclusions enumerate the most important features of the 2D barcode implementation for DNA.
Keywords: DNA - Deoxyribonucleic acid, 2D barcode, DNA2DBC, PDF417, code symbology.
1. Introduction
A barcode [6] is an optical representation of data. Originally, barcodes represented data as parallel lines and the spacings between them; these are referred to as linear or 1D (one-dimensional) barcodes or symbologies. They also come in patterns of squares, dots, hexagons and other geometric patterns within images, termed 2D (two-dimensional) matrix codes or symbologies. Although 2D systems use symbols other than bars, they are generally referred to as barcodes as well [6].
Barcodes can be read by optical scanners called barcode readers, or scanned from an image by special software. A barcode reader contains a photo-sensor that converts the barcode into an electrical signal as it moves across it. The scanner then measures the relative widths of the bars and spaces, translates the different patterns back into regular characters, and sends them to a computer or portable terminal. 2D readers are mostly based on cameras with CMOS sensors and picture-processing technology.
Barcodes were invented to label railroad cars, but they were not commercially successful until they were used to automate supermarket checkout systems, a task in which they have become almost universal [6].
2. Barcode Types
Each character in a barcode is represented by a pattern of wide and narrow bars. Every barcode begins with a special start character and ends with a special stop character [5-6]. These conventions help the barcode scanner to identify and read the symbol in the right position. Some barcodes may include a checksum character. A checksum is calculated from the characters in the barcode when the barcode is printed. The reader performs the same calculation in order to detect errors in the symbol. If the two checksums don't match, the reader assumes that something is wrong, throws out the data, and tries again.
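The checksum idea can be illustrated with a short sketch. The text does not name a specific checksum algorithm, so the example assumes the widely used EAN-13 check digit rule (weights 1 and 3 alternating over the first 12 digits); the function names are hypothetical and chosen only for illustration.

```python
def _is_ascii_digits(text: str) -> bool:
    """True when every character is one of 0-9."""
    return len(text) > 0 and all(c in "0123456789" for c in text)


def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit from the first 12 digits.

    Assumption: standard EAN-13 rule, weights 1, 3, 1, 3, ... from the left.
    """
    if len(first12) != 12 or not _is_ascii_digits(first12):
        raise ValueError("expected exactly 12 digits")
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def ean13_is_valid(code: str) -> bool:
    """Repeat the printer's calculation, as a reader would, and compare."""
    if len(code) != 13 or not _is_ascii_digits(code):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


if __name__ == "__main__":
    print(ean13_check_digit("400638133393"))  # check digit chosen at print time
    print(ean13_is_valid("4006381333931"))    # reader accepts matching checksum
    print(ean13_is_valid("4006381333932"))    # reader rejects a damaged symbol
```

A reader that gets `False` from `ean13_is_valid` would discard the data and scan again, which is the behaviour described above.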
2.1. 1D Numeric and Alphanumeric Barcodes
A Barcode Symbology [5-6] defines the technical details of a particular type of barcode: the width of the bars, character set, method of encoding, checksum specifications, etc. Barcode users are usually interested in the general capabilities of a particular symbology (how much and what kind of data it can hold, what its common uses are, etc.) and not in the technical details.
Journal of Mobile, Embedded and Distributed Systems, vol. III, no. 3, 2011
ISSN 2067
The most used 1D numeric barcodes/symbologies are:
Codabar: used in library systems, sometimes in blood banks
Code 11: used primarily for labeling telecommunications equipment
EAN-13: European Article Numbering international retail product code
EAN-8: compressed version of EAN code for use on small products
Industrial 2 of 5: older code not in common use anymore
Interleaved 2 of 5: widely used in industry, air cargo
Plessey: older code commonly used for retail shelf marking
MSI: variation of the Plessey code commonly used in USA
PostNet: used by U.S. Postal Service for automated mail sorting
UPC-A: Universal Product Code seen on almost all retail products in the USA and Canada
Standard 2 of 5: older code not in common use
UPC-E: compressed version of UPC code for use on small products.
The most used 1D alphanumeric barcodes/symbologies are:
Code 128: very capable code, excellent density, high reliability; in very wide use world-wide
Code 39: general-purpose code, wide use world-wide
Code 93: compact code similar to Code 39
LOGMARS: same as Code 39, is the U.S. Government specification
2.2. 2D Barcodes
Two-dimensional (2D) symbols encode data in two-dimensional shapes. They fall into two general categories:
Stacked barcodes, constructed like a layer of barcodes stacked one on top of the other; they can be read by special 2D scanners or by many CCD (charge-coupled device) and laser scanners with the aid of special decoding software.
Matrix Codes, built on a true 2D matrix; they are usually more compact than a stacked barcode, and they can be read only by 2D scanners.
The main advantage of 2D barcodes is the ability to encode a lot of information in a small space. Whereas 1D barcodes can encode 20 to 25 characters, 2D symbols can encode from 100 to about 2,000 characters.
The most used 2D barcodes/symbologies are (classification according to [5]):
PDF417: used for encoding large amounts of data
DataMatrix: can hold large amounts of data, in very small codes
Maxicode: fixed length, used by United Parcel Service for automated package sorting
QR Code: used for material control and order confirmation
Data Code
Code 49
3. Barcode 2D
The PDF417 code is part of the two-dimensional barcode family. PDF stands for "Portable Data File" because, with several rows and columns, it is possible to encode up to 2700 bytes (a lot of the PDF417 information is copyrighted in [4]).
The encoding is done in two stages:
High level encoding
The data (input bytes) are converted to "codewords". From now on, CW stands for codeword.
High level encoding supports multiple encoding modes, such as:
Byte: encodes ASCII codes 0 to 255, approx. 1.2 bytes per CW. The "Byte" mode allows encoding 256 different bytes, which is the entire extended ASCII table (ISO 8859). For ASCII code values please refer to [8].
Text: encodes ASCII codes 9, 10, 13 and 32 to 127, 2 characters per CW.
Numeric: encodes only the digits 0 to 9, 2.9 digits per CW.
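The figure of about 1.2 bytes per CW can be made concrete with a small sketch. It assumes the byte compaction rule of the PDF417 standard, which is not spelled out in the text above: a block of 6 bytes is read as one base-256 number and rewritten as 5 base-900 digits, each digit being one data CW. The function names are hypothetical. The input "Super!" is used only because it is exactly 6 bytes long.

```python
from typing import List

BASE = 900  # data CW values run from 0 to 899


def pack_six_bytes(block: bytes) -> List[int]:
    """Rewrite 6 bytes (base 256) as 5 codewords (base 900), most significant first."""
    if len(block) != 6:
        raise ValueError("this sketch handles exactly 6 bytes")
    value = int.from_bytes(block, "big")
    codewords = []
    for _ in range(5):
        value, digit = divmod(value, BASE)
        codewords.append(digit)
    return codewords[::-1]


def unpack_five_codewords(codewords: List[int]) -> bytes:
    """Inverse of pack_six_bytes."""
    if len(codewords) != 5 or any(not 0 <= cw < BASE for cw in codewords):
        raise ValueError("expected 5 codewords in the range 0..899")
    value = 0
    for cw in codewords:
        value = value * BASE + cw
    return value.to_bytes(6, "big")


# The packing is lossless because every 6-byte value fits in 5 base-900 digits.
assert 256 ** 6 < BASE ** 5

if __name__ == "__main__":
    cws = pack_six_bytes(b"Super!")
    print(cws)
    print(len(b"Super!") / len(cws))   # bytes per CW in byte mode
    print(round(44 / 15, 2))           # digits per CW, assuming 44 digits per 15 CW
```

The last line relies on a second assumption from the same standard, that numeric mode packs up to 44 digits into 15 CW, which is where a value close to 2.9 digits per CW comes from. Text mode reaches 2 characters per CW because two sub-mode values in the range 0 to 29 fit into one CW (30 x 30 = 900).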
Low level encoding
The codewords obtained during the first stage are converted to patterns of bars and spaces.
Moreover, an error correction system with several levels is included in order to allow reconstructing badly printed, erased, fuzzy or torn-off data.
The general structure of PDF417 is [4]:
CW stands for codeword.
The width of the smallest/finest bar is called the module.
A bar module is represented by "1" and a space module by "0".
The code has 3 to 90 rows.
A row has 1 to 30 data columns and its width goes from 90 to 583 modules, including the margins.
Maximum number of CWs in a barcode: 928, including at most 925 for the data (1 is for the length descriptor and at least 2 are for the error correction).
There are 929 distinct CW values, numbered from 0 to 928, of which 900 are for the data.
The error correction levels go from 0 to 8. The correction covers 2 (on level 0) to 512 (on level 8) CWs.
A row consists of: "a start character", "a left side CW", "1 to 30 data CWs", "a right side CW" and "a stop character". There must be a white margin of at least 2 modules on each side.
Padding CWs (e.g. codewords with value "900") can be inserted between the data and the correction CWs; they must be located at the end.
The first CW indicates the total number of CWs in the code, including the data CWs, the padding CWs and itself, but excluding the correction CWs.
The "Macro PDF417" mechanism allows distributing more data over several barcodes.
The CW numbers 900-928 have special meanings; some of them switch between modes in order to optimise the code (Table 1).
Table 1. Special CW
CW number : Function
900 : Switch to "Text" mode
901 : Switch to "Byte" mode
902 : Switch to "Numeric" mode
903 to 912 : Reserved
913 : Switch to "Octet" only for the next CW
914 to 920 : Reserved
921 : Initialization
922 : Terminator codeword for Macro PDF control block
923 : Sequence tag to identify the beginning of optional fields in the Macro PDF control block
924 : Switch to "Byte" mode (if the total number of bytes is a multiple of 6)
925 : Identifier for a user defined Extended Channel Interpretation (ECI)
926 : Identifier for a general purpose ECI format
927 : Identifier for an ECI of a character set or code page
928 : Macro marker CW to indicate the beginning of a Macro PDF Control Block
This section presents the practical high-level encoding of the word "Super!" in both text and byte mode. This presentation supports our 2D barcode defined for encoding DNA structure.
The high-level encoding in text mode has 4 sub-modes:
Mixed: Numeric and punctuation
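Looking back at the general structure of PDF417 listed before Table 1, the numeric limits quoted there can be cross-checked with a short sketch. Two values are assumptions taken from the PDF417 standard rather than from the text: every start, side and data CW is 17 modules wide, and the stop pattern is 18 modules wide. The doubling of error correction CWs per level is also an assumption; the text only gives the end points (2 on level 0, 512 on level 8). All function names are hypothetical.

```python
MODULES_PER_CW = 17         # assumption: width of start, side and data CWs
STOP_PATTERN_MODULES = 18   # assumption: width of the stop pattern
MIN_MARGIN_MODULES = 2      # from the text: white margin on each side
MAX_CWS_IN_SYMBOL = 928     # from the text


def error_correction_cws(level: int) -> int:
    """Number of correction CWs for levels 0..8, assuming 2 ** (level + 1)."""
    if not 0 <= level <= 8:
        raise ValueError("error correction level must be between 0 and 8")
    return 2 ** (level + 1)


def max_data_cws(level: int) -> int:
    """CWs left for data after the length descriptor and the correction CWs."""
    return MAX_CWS_IN_SYMBOL - 1 - error_correction_cws(level)


def row_width_modules(data_columns: int) -> int:
    """Row width including both margins, for 1..30 data columns."""
    if not 1 <= data_columns <= 30:
        raise ValueError("a row has 1 to 30 data columns")
    # start + left side CW + right side CW + the data CWs, then the stop pattern
    inner = (3 + data_columns) * MODULES_PER_CW + STOP_PATTERN_MODULES
    return inner + 2 * MIN_MARGIN_MODULES


if __name__ == "__main__":
    print(error_correction_cws(0), error_correction_cws(8))
    print(max_data_cws(0))
    print(row_width_modules(1), row_width_modules(30))
```

Under these assumptions the sketch reproduces the figures stated in the list: 2 to 512 correction CWs, at most 925 data CWs, and row widths from 90 to 583 modules.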
