Attribution Non-Commercial (BY-NC)


CSCI-589

Abstract. Finite Automata Based Algorithms for the Generalized Constrained Longest Common Subsequence Problems [i] solves the following problems: STR-IC-LCS, SEQ-IC-LCS, STR-EC-LCS and SEQ-EC-LCS. For the generalized constrained longest common subsequence (GC-LCS) problem for strings S1 and S2 with respect to P, the time complexity of the solutions is O(r(n + m) + nm) for a fixed-size alphabet, where r, n and m are the lengths of P, S1 and S2 respectively. These problems and solutions can be extended to the regular expression constrained longest common subsequence problem. I show how the finite automata based algorithm can be applied to that problem.

Finite Automata Based Algorithms for the Generalized Constrained Longest Common Subsequence Problems [i] solves four problems: STR-IC-LCS, SEQ-IC-LCS, STR-EC-LCS and SEQ-EC-LCS. For the generalized constrained longest common subsequence (GC-LCS) problem for strings S1 and S2 with respect to P, the time complexity of the solutions is O(r(n + m) + nm) for a fixed-size alphabet, where r, n and m are the lengths of P, S1 and S2 respectively. One of the closest problems to these is sequence alignment. The Regular Expression Constrained Sequence Alignment problem was introduced, together with a solution, by A. N. Arslan [ii]. It is stated as follows: given strings S1, S2 and a regular expression R, find the maximum alignment score between S1 and S2 over all alignments in which there exists a segment where some substring s1 of S1 is aligned to some substring s2 of S2, and both s1 and s2 match R, i.e. s1, s2 ∈ L(R), where L(R) is the regular language described by R. The solution in [ii] is an O(nmr) time algorithm, where r = O(t^4) and t is the number of states of a nondeterministic finite automaton N that accepts L(R). In this paper I attempt to present a finite automata based algorithm for the Regular Expression Constrained Sequence Alignment problem.
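To make the constraint concrete, here is a minimal brute-force sketch of the simpler LCS variant, under the assumption that the whole common subsequence must match R. It enumerates every subsequence, so it is exponential and only useful as a reference on tiny inputs. The function names and the example strings are illustrative and do not come from the cited papers.

```python
import re
from itertools import combinations


def subsequences(s):
    """Return the set of all distinct subsequences of s (exponential in len(s))."""
    return {
        "".join(s[i] for i in idx)
        for k in range(len(s) + 1)
        for idx in combinations(range(len(s)), k)
    }


def brute_force_constrained_lcs(s1, s2, pattern):
    """Longest common subsequence of s1 and s2 that fully matches `pattern`.

    Returns None when no common subsequence matches.
    """
    common = subsequences(s1) & subsequences(s2)
    matching = [z for z in common if re.fullmatch(pattern, z)]
    return max(matching, key=len, default=None)


# Hypothetical example: R = b+a (one or more b's followed by a single a).
print(brute_force_constrained_lcs("abcbdab", "bdcaba", "b+a"))
```

With the pattern `.*` the constraint is vacuous and the function returns an ordinary LCS.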

The algorithm to find an LCS of S1 and S2 constrained by a regular expression R as a substring is presented below.

Step 1. Construct the minimal automaton MR for R such that L(MR) = L(R). The method used to construct a minimal automaton from a regular expression is presented in [iii]. The time complexity is given as O(r log(r)), where r is the size of the regular expression R. The space complexity, i.e. the number of states in the resulting minimal automaton, is O(r).

Step 2. Construct the subsequence automaton M1 for S1. We can construct a Directed Acyclic Sequence Graph (DASG) for the string S1 in O(n log(n)) time and O(n) space, where n is the size of S1, as given in [iv]. L(M1) is the set of all subsequences of S1 (up to 2^n of them).

Step 3. Construct the subsequence automaton M2 for S2. As in step 2, we can construct a DASG for the string S2 in O(m log(m)) time and O(m) space, where m is the size of S2, as given in [iv]. L(M2) is the set of all subsequences of S2 (up to 2^m of them).

Step 4. Construct the intersection automaton M1R of M1 and MR. L(M1R) = L(M1) ∩ L(MR) contains all the subsequences of S1 that satisfy the regular expression constraint R. This step consumes O(nr) time and at most O(nr) space.

Step 5. Construct the intersection automaton M2R of M2 and MR. L(M2R) = L(M2) ∩ L(MR) contains all the subsequences of S2 that satisfy the regular expression constraint R. This step consumes O(mr) time and at most O(mr) space.

Step 6. Construct the intersection automaton M12R of M1R and M2R. L(M12R) = L(M1R) ∩ L(M2R) contains all the common subsequences of S1 and S2 that satisfy the regular expression constraint R. This step consumes O(nmr^2) time and at most O(nmr^2) space.

Step 7. Find the maximum-value path for the alignment score in M12R. Using Dijkstra's algorithm to find maximum paths, with each edge weighted by the score of its edit operation x → y, we can find the maximum alignment (the longest common subsequence with maximum alignment score). The simplest implementation of Dijkstra's algorithm, backed by a binary heap, takes O(nmr^2 + log(nmr^2)) time. The total running time is bounded by step 7, at O(nmr^2 + log(nmr^2)).
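The steps above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions, not the construction from [iii] or [iv]: the automaton for R is supplied by hand as a deterministic transition table (`dfa_delta`, `dfa_start`, `dfa_accept` are hypothetical names), the subsequence automata are built as simple next-occurrence tables, every edge has unit weight (so the result is a longest constrained common subsequence, not a scored alignment), and the longest path is found by a memoized depth-first search over the acyclic product instead of Dijkstra's algorithm. Because the automaton for R is deterministic, both copies of it read the same string and stay in the same state, so the sketch tracks a single state q in the product.

```python
from functools import lru_cache


def subsequence_automaton(s, alphabet):
    """Build nxt, where nxt[i][c] is the state reached from state i on symbol c.

    State i means "the first i characters of s are already consumed";
    nxt[i][c] is None when c does not occur in s[i:].
    """
    n = len(s)
    nxt = [None] * (n + 1)
    last = {c: None for c in alphabet}
    for i in range(n, -1, -1):
        nxt[i] = dict(last)
        if i > 0:
            last[s[i - 1]] = i
    return nxt


def constrained_lcs(s1, s2, dfa_delta, dfa_start, dfa_accept):
    """Longest common subsequence of s1 and s2 accepted by the given DFA.

    dfa_delta maps (state, symbol) to the next state; missing keys are dead ends.
    Returns None when no common subsequence is accepted.
    """
    alphabet = set(s1) & set(s2)
    a1 = subsequence_automaton(s1, alphabet)
    a2 = subsequence_automaton(s2, alphabet)

    @lru_cache(maxsize=None)
    def best(i, j, q):
        # Longest string readable from product state (i, j, q) that ends
        # in an accepting DFA state, or None if no such string exists.
        result = "" if q in dfa_accept else None
        for c in sorted(alphabet):
            i2, j2, q2 = a1[i][c], a2[j][c], dfa_delta.get((q, c))
            if i2 is None or j2 is None or q2 is None:
                continue
            tail = best(i2, j2, q2)
            if tail is not None and (result is None or len(tail) + 1 > len(result)):
                result = c + tail
        return result

    return best(0, 0, dfa_start)


# Hypothetical hand-built DFA for R = b+a: 0 --b--> 1 --b--> 1 --a--> 2 (accepting).
delta = {(0, "b"): 1, (1, "b"): 1, (1, "a"): 2}
print(constrained_lcs("abcbdab", "bdcaba", delta, 0, {2}))
```

The recursion depth is bounded by the length of the shorter string, so this form is only suitable for short inputs; an iterative pass over the product states in topological order would avoid that limit.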

[i] Effat Farhana, Jannatul Ferdous, Tanaeem Moosa and M. Sohel Rahman, Finite Automata Based Algorithms for the Generalized Constrained Longest Common Subsequence Problems.
[ii] A. N. Arslan, Regular Expression Constrained Sequence Alignment, Journal of Discrete Algorithms.
[iii] Sanjay Bhargava and G. N. Purohit, Construction of a Minimal Deterministic Finite Automaton from a Regular Expression.
[iv] Ricardo A. Baeza-Yates, Searching Subsequences.
