Professional Documents
Culture Documents
Bioinformatics
Sequential Pattern mining
11/21/2022
Sequence Databases & Sequential Patterns
51 length-2 <b>
<c>
<ba>
<ca>
<bb>
<cb>
<bc>
<cc>
<bd>
<cd>
<be>
<ce>
<bf>
<cf>
Candidates <d>
<e>
<da>
<ea>
<db>
<eb>
<dc>
<ec>
<dd>
<ed>
<de>
<ee>
<df>
<ef>
<f> <fa> <fb> <fc> <fd> <fe> <ff>
• Find all the length-2 seq. pat. Having prefix <a>: <aa>, <ab>, <(ab)>,
<ac>, <ad>, <af>
– Further partition into 6 subsets
SID sequence
• Having prefix <aa>; 10 <a(abc)(ac)d(cf)>
• … 20 <(ad)c(bc)(ae)>
30 <(ef)(ab)(df)cb>
• Having prefix <af>
40 <eg(af)cbc>
<aa>-proj. db … <af>-proj. db