Professional Documents
Culture Documents
Fast Algorithms For The Constrained Longest Increasing Subsequence Problems
Fast Algorithms For The Constrained Longest Increasing Subsequence Problems
-226-
The 25th Workshop on Combinatorial Mathematics and Computation Theory
il and ai1 < ai2 < . . . < ail . The RLIS and SLIS in the subsequence. In this section, we define that
problems are formally defined as follows. ai ≺R aj if and only if LV ≤ aj − ai ≤ UV and
LI ≤ j − i ≤ UI .
The RLIS Problem:
Input: a sequence of comparable elements
2.1 The Algorithm
a1 , a2 , . . . , an and 0 < LI ≤ UI < n,
0 ≤ LV ≤ UV
The Rank of element ai , denoted by Ranki , is
Output: an RLIS ai1 , ai2 , . . . , ail , which is a
defined as the length of an RLIS ending at ai and
maximum-length increasing subse-
computed by the following lemma.
quence satisfying LI ≤ ik+1 − ik ≤
UI and LV ≤ aik+1 − aik ≤ UV for all Lemma 1: If aj ⊀R ai for all j, let Ranki = 1,
1≤k<l and max1≤j<i {Rankj |aj ≺R ai } + 1 otherwise.
An element can be plotted on a 2-D plane us-
Our algorithm utilizes a data structure called
ing its index as the x-ordinate and value as the
the dynamic RMQ [17] described as follows. Let
y-ordinate. Consequently, the sequence A would
S be a set of items which is initially an empty set
become n points on a 2-D plane and any pair of
and maintained by an AVL tree [1]. Each item
points makes a slope.
(v, r) of S includes two given values v and r, and
The SLIS Problem: can be inserted to or deleted from S by the key v
Input: a sequence of comparable elements at any time. For any pair (p, q), the maximum r
a1 , a2 , . . . , an and a nonnegative slope among the items in S whose v value is within the
boundary m range [p, q] can be queried by this data structure.
Output: an SLIS ai1 , ai2 , . . . , ail , which is a All operations run in O(log |S|) time, where |S| is
maximum-length increasing subse- the cardinality of S at the operation time.
quence such that the slope between In the RLIS problem, let v and r be the value
two consecutive points is no less and Rank of ai , respectively. In other words, each
ai −aik
than the input ratio, i.e., ik+1 k+1 −ik
≥ item in S has the form (ai , Ranki ). We define
m for all 1 ≤ k < l three operations specific to this problem, in which
all of them run in O(log |S|) time by the dynamic
Notice that the inputs of the RLIS and SLIS RMQ datastructure.
problems are constrained on 0 < LI ≤ UI < n,
0 ≤ LV ≤ UV , and m ≥ 0, respectively. If LV = 0 1. Insert(ai , Ranki ) — inserts the item
or m = 0, the output subsequence would be a non- (ai , Ranki ) into S.
decreasing subsequence. Moreover, our algorithm
for the RLIS problem still works if LV < 0, and 2. Delete(ai ) — deletes the item (ai , Ranki )
the output would be another related subsequence from S.
where any two consecutive elements may decrease 3. Query(p, q) — returns the item (ai , Ranki )
by at most −LV . with maximum Rank among all elements in
The rest of this paper is organized as follows. S, where p ≤ ai ≤ q.
Section 2 gives an O(n log(UI −LI )) time and O(n)
space algorithm for the RLIS problem. In Section The strategy of this algorithm is using the dy-
3, an O(n log r) time and O(n) space algorithm is namic programming to parse each element ai from
proposed for solving the SLIS problem, where r i = 1 to n. At the ith iteration, S maintains the
is the length of an SLIS. Concluding remarks are set of items (ak , Rankk ) where the index k is in
given in Section 4. the range [i − UI , i − LI ], and Ranki is computed.
Then, any element in S dominates ai if and only if
its value is in the range [ai − UV , ai − LV ]. Query-
2 The RLIS Problem ing in the dynamic RMQ datastructure with the
range [ai − UV , ai − LV ] obtains the element which
The input of the RLIS problem contains an ar- dominates ai with maximum Rank.
ray A = a1 , a2 , . . . , an , 0 < LI ≤ UI < n, and The pseudo code for the RLIS problem is given
0 ≤ LV ≤ UV . For any two elements ai and aj in Figure 1. At the first step of for loop, in or-
in A, ai is said to dominate aj , or said to be a der to fit in with the constraint of LI and UI , it
dominating element of aj , denoted by ai ≺ aj , if adjusts the set S by deleting ai−UI −1 and insert-
and only if aj can be the successive element of ai ing (ai−LI , Ranki−LI ). Secondly, querying the
-227-
The 25th Workshop on Combinatorial Mathematics and Computation Theory
-228-
The 25th Workshop on Combinatorial Mathematics and Computation Theory
this algorithm is also using dynamic programming For any element a ∈ Rk , there must exist an
to parse each element ai from i = 1 to n, and element a ∈ Rk−1 such that a ≺S a. At the ith
Ranki is computed at the ith iteration. At the iteration, a ≺S ai implies that a ≺S ai by the
ith iteration, let Rk be any partition of the ele- property of transitivity. By Lemma 5, we know
ment set {a1 , a2 , . . . , ai−1 }. An element c ∈ Rk is that the critical point c in Rk−1 dominates ai , i.e.,
called a critical point in Rk if an element in Rk c ≺S ai . Thus, the following lemma holds.
dominates ai implies that c also dominates ai .
When we want to find out if there exists any Lemma 6: For any element a ∈ Rk at the ith
dominating element in a partition Rk , it is suffi- iteration, if a ≺S ai , then the critical point of
cient to check only the critical point in Rk . We Rk−1 also dominates ai .
shall prove that for any partition Rk , the last in-
By the property of transitivity and Lemma 6,
serted element is the critical point. Thus, only the
we have Corollary 7.
element with maximum index in each partition Rk
should be recorded in our algorithm. Corollary 7: For any element a ∈ Rk at the ith
At the ith iteration of the algorithm, suppose iteration, if a ≺S ai , then for all critical points in
that the critical point which dominates ai with Rj with j < i dominates ai .
maximum Rank is in the partition of Rk . We then
have Ranki = k + 1, thus ai is inserted into Rk+1 . By Lemma 5, it is obvious to know that Al-
Similarly, if there does not exist any dominating gorithm SLIS implements the Lemma 6. By
element of ai , then ai ∈ R1 and Ranki = 1. the Corollary 7, performing a binary search takes
We shall also show that for any element a ∈ Rk O(log |R|) time for each element, where |R| is the
at the ith iteration, if a ≺S ai , then we have number of Rank partitions and does not greater
c ≺S ai for all critical points c in Rj with j < k. than the output size r. It takes O(n) space
By this property, the algorithm uses binary search to record all the critical points and backtracking
to find the dominating critical point with maxi- links. Therefore, we have the following theorem.
mum Rank. The remaining steps of the algorithm
such as making backtracking links and outputting Theorem 8 : Algorithm SLIS delivers an SLIS
length and sequence are the same with Section 2. of an input sequence in O(n log r) time and O(n)
space, where n is the input size and r is the output
3.2 Correctness and Efficiency size.
-229-
The 25th Workshop on Combinatorial Mathematics and Computation Theory
RLIS problem in O(n log(UI −LI )) time and O(n) [8] A.L. Delcher, S. Kasif, R.D. Fleischmann, J.
space, where UI and LI are the upper and lower Peterson, O. white, and S.L. Salzberg. Align-
bounds of differences on indices, respectively. An- ment of whole genomes. Nucleic Acids Res.,
other algorithm presented in this paper utilizes the 27(11):2369–2376, 1999.
concept of “critical point” to solve the SLIS prob-
lem in O(n log r) time and O(n) space, where r is [9] A.L. Delcher, A. Phillippy, J. Carlton, and
the output length. S.L. Salzberg. Fast algorithms for large-scale
genome alignment and comparison. Nucleic
Acknowledgments. We thank our advisor Prof. Acids Res., 30(11):2478–2483, 2002.
Kun-Mao Chao for valuable comments. I-Hsuan
Yang and Yi-Ching Chen were supported in part [10] L. Florea, G. Hartzell, Z. Zhang, G.M. Rubin,
by NSC grants 95-2221-E-002078, 95-2221-E-002- and W. Miller. A computer program for align-
126-MY3, and 96-2221-E-002-034 from the Na- ing a cDNA sequence with a genomic DNA
tional Science Council, Taiwan. sequence. Genome Res., 8(9), 967–974, 1998.
[1] G.M. Adelson-Velskil and E.M. Landis. So- [12] I. Katriel and M. Kutz. A faster algorithm
viet Math. (Dokl.) 3, 1259–1263, 1962. for computing a longest common increasing
subsequence. Research Report MPI-I-2005-
[2] M.H. Albert, A. Golynski, A.M. Hamel, A.
1-007, Max-Planck-Institut für Informatik,
López-Ortiz, S.S. Rao, and M.A. Safari.
Stuhlsatzenhausweg 85, 66123 Saarbrücken,
Longest increasing subsequences in sliding
German, March 2005.
windows. Theor. Comput. Sci., 321:405–414,
2004. [13] D.E. Knuth. The Art of Computer Program-
[3] D. Aldous and P. Diaconis. Longest increas- ming. Vol 3, Sorting and searching. Addison-
ing subsequences: From patience sorting to Wesley, 1998.
the Baik-Deift-Johansson theorem. Bulletin
[14] S. Kurtz, A. Phillippy, A.L. Delcher, M.
(New Series) of the American Mathematical
Smoot, M. Shumway, C. Antonescu, and
Society, 36(4):413–432, 1999.
S.L. Salzberg. Versatile and open software for
[4] S. Bespamyatnikh and M. Segal. Enumerat- comparing large genomes. Genome Biology.,
ing longest increasing subsequences and pa- 5:R12, 2004.
tience sorting. Inf. Process. Lett., 76:7–11,
2000. [15] U. Manber. Introduction to algorithms – A
creative approach. Addison-Wesly, 1989.
[5] G.S. Brodal, K. Kaligosi, I. Katriel, and
M. Kutz. Faster algorithms for comput- [16] C. Schensted. Longest increasing and decreas-
ing longest common increasing subsequences. ing subsequences. Canad. J. Math., 13:179–
BRICS-RS-05-37, December 2005. 191, 1961.
[6] W.T. Chan, Y. Zang, S.P.Y Fung, D. Ye, [17] T. Shibuya and I. Kurochkin. Match chain-
and H. Zhu. Efficient algorithms for finding ing algorithms for cDNA mapping. In Proc.
a longest common increasing subsequence. In 3rd International Workshop on Algorithms in
Proc. 16th Annual International Symposium Bioinformatics (WABI 2003), 2812:462–475,
on Algorithms and Computation (ISAAC 2003.
2005), 665–674, 2005.
[18] T.F. Smith and M.S. Waterman. Comparison
[7] E. Chen, H. Yuan, and L. Yang. Longest of biosequences. Adv. in Appl. Math., 2:482–
increasing subsequences in windows based 489, 1981.
on canonical antichain partition. In Proc.
16th Annual International Symposium on Al- [19] P. van Emde Boas. Preserving order in a for-
gorithms and Computation (ISAAC 2005), est in less than logarithmic time and linear
1153–1162, 2005. space. Inf. Process. Lett., 6(3):80–82, 1977.
-230-
The 25th Workshop on Combinatorial Mathematics and Computation Theory
-231-