Professional Documents
Culture Documents
R.C.T. Lee and Chin Lung Lu CS 5313 Algorithms for Molecular Biology
Brute-force method
m T P
Brute-force method
T P a b a c a b a b a b a b a b a b a b a b a b a b a b a b Simple, but not ecient because it cannot avoid rescanning T
By R.C.T. Lee and C.L. Lu Exact String Matching p.4
KMP algorithm
Left-to-right scan Failure function shift rule
KMP algorithm
T P shift a b a c a b a b Left-to-Right Scan a b a b mismatch occurs a b a b a b a b a b a b shift
By R.C.T. Lee and C.L. Lu
shift
a b a b
Exact String Matching p.7
Failure function
If P = p1 p2 pn , then its failure function f is dened as follows, where 1 j n. if such an largest i < j such that p1 pi = pji+1 pj , i 1 exists, f (j) = 0, otherwise. i P 1
By R.C.T. Lee and C.L. Lu
i i (j i + 1) j
Exact String Matching p.8
KMP algorithm
/* 1 2 3 4 5 6 7 8 9 T = T1 T2 Tm and P = p1 p2 pn */ i = 1; j = 1; while i m and j n do if Ti = pj then i = i + 1; j = j + 1; else if j = 1 then i = i + 1; j = 1; else j = f (j 1) + 1;
end if end while if j = n +
KMP algorithm
T abac f (j) 0 0 1 2 P abab 1 2 3 4 5 6 7 8
abab
(1, 1) (4, 4), j = f (3) + 1 = 2
No this step
(4, 2) (4, 2), j = f (1) + 1 = 1 (4, 1) (4, 1), i = 4 + 1 = 5, j = 1 (5, 1) (8, 4), i = 8 + 1, j = 4 + 1
pj
after shift
f (j) pj
f (j) pj
pj
pj
f (f (j 1))
By R.C.T. Lee and C.L. Lu
f (j 1)
Exact String Matching p.16
Boyer-Moore algorithm
Right-to-left scan Bad character shift rule Good sux shift rule
Right-to-left scan
Check whether P occurs in T at some position in the right-to-left scaning manner THE BOYERMOORE ALGORITHM Right-to-Left Scan P BUYERMOORE
x x
B(x)
t t y t
Exact String Matching p.20
y
i
B ) (x
t
Right-to-Left Scan
y
i
B(x) = 0
B ) (x
T P before shift y
x y
i
t x
Right-to-Left Scan
t x
B(x)
t x
P after shift ()
t y t
P after shift
By R.C.T. Lee and C.L. Lu
P after shift