Ky Thuat Hash (The Hash Technique)
(0 ≤ i ≤ m) and the hash codes of all prefixes of T, namely the hash codes of the strings T[1..i] (1 ≤ i ≤ m).
pow[0] = 1
for (i : 1 .. m)
pow[i] = (pow[i-1] * 26) mod base
hashT[0] = 0
for (i : 1 .. m)
hashT[i] = (hashT[i-1] * 26 + T[i] - 'a') mod base
From the code above, we obtain the array pow[] (storing 26^i mod base) and the array hashT[] (storing the hash code of T[1..i]).
To get the hash code of T[i..j], we write the following function:
function getHashT(i, j):
return (hashT[j] - hashT[i - 1] * pow[j - i + 1] + base * base) mod base
With that, the main problem is solved. Here is the main program:
for (i : 1 .. m - n +1)
if hashP = getHashT(i, i + n - 1):
print("Match position: ", i)
c. Program code:
The following program, which I wrote in C++, is a solution to the problem on the VOJ online judge.
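#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
#define FOR(i,a,b) for(int i=a;i<=b;i++)
#define base 1000000003LL   // modulus for the hash codes
#define ll long long
#define maxn 1000111        // array size (upper bound on the length of T)
using namespace std;
ll POW[maxn],hashT[maxn];   // POW[i] = 26^i mod base, hashT[i] = hash code of T[1..i]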
// Hash: A String Matching Algorithm
// Author: Le Khac Minh Tue
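// Hash code of T[i..j]; adding base*base keeps the value non-negative before %
ll getHashT(int i,int j) {
    return (hashT[j]-hashT[i-1]*POW[j-i+1]+base*base)%base;
}
int main() {
    string T,P;
    cin >> T >> P;
    int m=T.size(),n=P.size();
    T=" "+T;P=" "+P;   // shift both strings to 1-based indexing
    POW[0]=1;
    FOR(i,1,m) POW[i]=(POW[i-1]*26) % base;
    FOR(i,1,m) hashT[i]=(hashT[i-1]*26+T[i]-'a') % base;
    ll hashP=0;
    FOR(i,1,n) hashP=(hashP*26+P[i]-'a') % base;
    // Report every position i where T[i..i+n-1] has the same hash code as P
    FOR(i,1,m-n+1) if(hashP==getHashT(i,i+n-1)) printf("%d ",i);
    return 0;
}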
d. Evaluation:
The complexity of the algorithm is O(m + n). More importantly, once the prefix hashes are built, we can check whether two strings are equal in O(1). This is what makes the Hash algorithm so flexible. Besides flexibility and execution speed, the implementation of this algorithm is very simple compared with other string-processing algorithms.
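As a minimal sketch of the O(1) comparison described above, the following C++ fragment wraps the prefix-hash arrays in a small helper. The names PrefixHash, get and sameSubstring are illustrative and do not come from the article. The radix 26 and the modulus follow the article's program, and the input is assumed to contain only lowercase letters 'a'..'z'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const long long BASE = 1000000003LL;  // same modulus as the article's program

// Illustrative helper (not from the article): prefix hashes of a lowercase string.
struct PrefixHash {
    std::vector<long long> pw;  // pw[i] = 26^i mod BASE
    std::vector<long long> h;   // h[i]  = hash code of t[1..i] (1-based)

    explicit PrefixHash(const std::string& t) : pw(t.size() + 1), h(t.size() + 1) {
        pw[0] = 1;
        h[0] = 0;
        for (std::size_t i = 1; i <= t.size(); ++i) {
            pw[i] = pw[i - 1] * 26 % BASE;
            h[i] = (h[i - 1] * 26 + (t[i - 1] - 'a')) % BASE;
        }
    }

    // Hash code of t[i..j], 1-based and inclusive, as in getHashT(i, j).
    long long get(int i, int j) const {
        assert(1 <= i && i <= j && j < static_cast<int>(h.size()));
        // The difference can be negative, so BASE * BASE is added before %.
        return (h[j] - h[i - 1] * pw[j - i + 1] + BASE * BASE) % BASE;
    }

    // O(1) test: equal lengths and equal hash codes.
    // Equal hash codes do not strictly prove equal strings (collisions can occur).
    bool sameSubstring(int i1, int j1, int i2, int j2) const {
        return (j1 - i1 == j2 - i2) && get(i1, j1) == get(i2, j2);
    }
};
```

For instance, with PrefixHash ph("abcabc"), the call ph.sameSubstring(1, 3, 4, 6) compares the two occurrences of "abc" using a constant number of array lookups, regardless of the substring length.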
3. Applications:
As mentioned above, this algorithm can give wrong answers in some cases. Of course, besides Hash there are many other string-processing algorithms that are always exact. I will call them standard algorithms for now. Implementing a standard algorithm may give a faster and more accurate program. However, the contestant pays for this with the complexity of implementing those standard algorithms.
Using Hash not only makes implementation easier for the contestant; more importantly, Hash can do things that standard algorithms cannot. Below, I consider a few examples to demonstrate this.
a. Longest palindrome substring
The problem is as follows: you are given a string S of length n (n ≤ 50 000). Find the length of the longest palindrome made of consecutive characters of S. (A palindrome is a string that reads the same in both directions.)
One standard algorithm that cannot be applied to this problem is KMP. Apart from KMP, two standard algorithms can be applied. The first uses Manacher's algorithm to compute the palindrome radius at every position in the string. The second uses a Suffix Array and LCP (Longest Common Prefix) on the string formed by concatenating S with S written in reverse order. Neither algorithm is easy to implement, and both are outside the scope of this article, so I only mention them briefly without going into detail.
Now we consider the non-standard algorithm, Hash. For simplicity, we consider the case where the palindrome has odd length (the even case is handled in exactly the same way). Suppose the longest odd-length palindrome has length l. Clearly, S contains palindromes of length l − 2, l − 4, ..., since removing one character from each end of a palindrome leaves a palindrome. However, S contains no palindrome of length l + 2, l + 4, ... Thus l can be found by binary search. We binary search for the largest possible length l. For each candidate length l, we need to check whether S contains a palindromic substring of length l. To do this, we iterate over all substrings of length l in S.
The remaining problem is: check whether S[i..j] (1 ≤ i ≤ j ≤ n; (j − i + 1) mod 2 = 1) is a palindrome.
The approach is as follows. Let T be S written in reverse order. With the Hash algorithm, we can check whether a given substring of S equals a given substring of T. So we need to check whether S[i..k] equals T[n − j + 1..n − k + 1], where k is the center of symmetry, in other words k = (i + j) / 2. The problem is thus solved. The complexity of this approach is O(n log(n)).
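The steps above (binary search on the length, then a hash comparison of S[i..k] against T[n − j + 1..n − k + 1]) can be sketched in C++ as follows. This is only an illustration under stated assumptions: the names PrefixHash and longestOddPalindrome are hypothetical, the input is assumed to consist of lowercase letters, and only the odd-length case is handled, as in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const long long BASE = 1000000003LL;  // modulus taken from the article's program

// Illustrative helper: prefix hashes of a lowercase string, 1-based.
struct PrefixHash {
    std::vector<long long> pw, h;  // pw[i] = 26^i mod BASE, h[i] = hash of s[1..i]
    explicit PrefixHash(const std::string& s) : pw(s.size() + 1), h(s.size() + 1) {
        pw[0] = 1;
        h[0] = 0;
        for (std::size_t i = 1; i <= s.size(); ++i) {
            pw[i] = pw[i - 1] * 26 % BASE;
            h[i] = (h[i - 1] * 26 + (s[i - 1] - 'a')) % BASE;
        }
    }
    long long get(int i, int j) const {  // hash code of s[i..j]
        return (h[j] - h[i - 1] * pw[j - i + 1] + BASE * BASE) % BASE;
    }
};

// Length of the longest odd-length palindromic substring of s (0 if s is empty).
int longestOddPalindrome(const std::string& s) {
    const int n = static_cast<int>(s.size());
    if (n == 0) return 0;
    const std::string t(s.rbegin(), s.rend());  // T = S reversed
    const PrefixHash hs(s), ht(t);

    // Does S contain a palindrome of odd length 2 * half + 1?
    auto exists = [&](int half) -> bool {
        const int len = 2 * half + 1;
        for (int i = 1; i + len - 1 <= n; ++i) {
            const int j = i + len - 1;
            const int k = (i + j) / 2;  // center of symmetry
            if (hs.get(i, k) == ht.get(n - j + 1, n - k + 1)) return true;
        }
        return false;
    };

    // If a palindrome of length 2*half+1 exists, so does one for every smaller half.
    int lo = 0, hi = (n - 1) / 2;  // half = 0 (a single character) always works
    while (lo < hi) {
        const int mid = (lo + hi + 1) / 2;
        if (exists(mid)) lo = mid; else hi = mid - 1;
    }
    return 2 * lo + 1;
}
```

Each call to exists scans O(n) windows with O(1) work per window, and the binary search makes O(log n) such calls. As with any hash comparison, a collision could in principle report a palindrome that is not there.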
b. k-th alphabetical cyclic
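The problem is as follows: you are given a sequence a_1, a_2, ..., a_n (n ≤ 50 000). Sort the n cyclic shifts of this sequence in lexicographic order. Specifically, the cyclic shifts of this sequence are (a_1, a_2, ..., a_n), (a_2, a_3, ..., a_n, a_1), (a_3, a_4, ..., a_n, a_1, a_2), ... One sequence is lexicographically smaller than another if, at the first position where they differ, its element is smaller than the other's. The task is: print the sequence that is k-th in lexicographic order.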
With a direct approach, we would generate all the cyclic shifts, sort them in lexicographic order with a sorting algorithm, and finally print the k-th sequence after sorting. However, the complexity of this algorithm is very high and cannot meet the time limit. Specifically, this approach has complexity O(n^2 log(n)), which is the product of the complexity of sorting and the complexity of each sequence comparison.
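For reference, the direct approach can be sketched in C++ as below. The function name kthRotationNaive is hypothetical, and k is assumed here to be a 1-based position in ascending lexicographic order. The sketch is meant as a baseline for small inputs only: for n = 50 000 it would need to store n^2 elements.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Naive baseline (illustrative): build all n cyclic shifts, sort them,
// and return the k-th one (1-based, ascending lexicographic order).
std::vector<int> kthRotationNaive(const std::vector<int>& a, int k) {
    const int n = static_cast<int>(a.size());
    assert(1 <= k && k <= n);
    std::vector<std::vector<int>> shifts(n);
    for (int start = 0; start < n; ++start) {
        shifts[start].reserve(n);
        for (int t = 0; t < n; ++t) {
            shifts[start].push_back(a[(start + t) % n]);
        }
    }
    // operator< on std::vector compares lexicographically, O(n) per comparison,
    // so the sort costs O(n^2 log n) in the worst case.
    std::sort(shifts.begin(), shifts.end());
    return shifts[k - 1];
}
```

For the sequence (2, 1, 3), the shifts are (2, 1, 3), (1, 3, 2) and (3, 2, 1), so k = 1 yields (1, 3, 2).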
Keeping the idea of sorting all the cyclic shifts and then printing the sequence at position k, we try to improve the complexity of comparing two sequences lexicographically. Recall the definition of lexicographic order for two sequences: consider two sequences x and y with the same number of elements. Let position i be the first position from the left where x_i ≠ y_i. Then x < y if and only if x_i < y_i. So we must find the longest common prefix of x and y, then compare the next element. To find the longest common prefix, we can use Hash combined with binary search.
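The comparison just described can be sketched in C++ as follows. The names SeqHash, lcp and lexLess, as well as the radix RADIX, are assumptions made for this illustration; elements are assumed to lie in [0, RADIX). The prefix hashes are rebuilt inside lexLess only to keep the example short. In a real solution they would be computed once, so that each comparison costs O(log n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const long long BASE = 1000000003LL;  // modulus from the article's program
const long long RADIX = 1000003LL;    // assumed radix; elements must be in [0, RADIX)

// Illustrative helper: h[i] is the hash code of the first i elements.
struct SeqHash {
    std::vector<long long> h;
    explicit SeqHash(const std::vector<int>& x) : h(x.size() + 1) {
        h[0] = 0;
        for (std::size_t i = 1; i <= x.size(); ++i) {
            assert(0 <= x[i - 1] && x[i - 1] < RADIX);
            h[i] = (h[i - 1] * RADIX + x[i - 1]) % BASE;
        }
    }
};

// Length of the longest common prefix of two sequences of length n.
// If the prefixes of length p match, all shorter prefixes match too,
// so the answer can be binary searched (up to hash collisions).
int lcp(const SeqHash& hx, const SeqHash& hy, int n) {
    int lo = 0, hi = n;
    while (lo < hi) {
        const int mid = (lo + hi + 1) / 2;
        if (hx.h[mid] == hy.h[mid]) lo = mid; else hi = mid - 1;
    }
    return lo;
}

// True when x is lexicographically smaller than y (equal sizes assumed).
bool lexLess(const std::vector<int>& x, const std::vector<int>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    const SeqHash hx(x), hy(y);
    const int p = lcp(hx, hy, n);  // x[0..p-1] == y[0..p-1] (0-based)
    return p < n && x[p] < y[p];   // compare the first differing element
}
```

A function such as lexLess could then serve as the comparator of a sorting routine, replacing the O(n) element-by-element comparison.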
To solve this problem, one more small technique is needed: instead of generating all the cyclic shifts, we only need to double the sequence a. The new sequence has 2n elements (a_1, a_2, ..., a_n, a_1, a_2, ...,