The scoring scheme: a set of rules that assigns an alignment score (the goodness of
alignment) to any given alignment of two sequences. But it does not tell us
how to find the best alignment!
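A minimal sketch of what "scoring a given alignment" means. The weights (match = +1, mismatch = -1, gap = -2), the '-' gap symbol and the name alignment_score are illustrative assumptions, not values taken from these notes. Note that the function only scores an alignment it is handed; it does not search for the best one.

```python
def alignment_score(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score one given alignment column by column.

    '-' marks a gap. The default weights are assumptions for illustration.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            score += gap        # one sequence has a gap in this column
        elif x == y:
            score += match      # identical letters
        else:
            score += mismatch   # different letters
    return score


# Six matching columns and one gap column: 6 * 1 + (-2)
print(alignment_score("GATTACA", "GA-TACA"))
```

Changing the weights changes which alignment looks "best", which is why the scoring scheme has to be fixed before any search for the optimum.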
--------------------
BWT transformation -> compression:
idea: compress RRRRBBBBTTT as 4R4B3T
BUT: there are not many runs of repeated letters in the genome (AGCTAGCT)
=> here comes BWT
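The run-length idea can be sketched as below (the function name run_length_encode is a hypothetical helper). It reproduces the 4R4B3T example and shows why a string without runs, such as AGCTAGCT, gains nothing.

```python
from itertools import groupby


def run_length_encode(text):
    """Replace each run of identical letters by <run length><letter>."""
    return "".join(f"{len(list(run))}{letter}" for letter, run in groupby(text))


print(run_length_encode("RRRRBBBBTTT"))  # 4R4B3T
print(run_length_encode("AGCTAGCT"))     # 1A1G1C1T1A1G1C1T, longer than the input
```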
Algorithm:
Take the string and sort all circular shifts of it, e.g. BANANA (or BANANA$ for suffix
trees):
ABANAN
ANABAN
ANANAB
BANANA
NABANA
NANABA
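A short sketch of this step, assuming the transform is read off as the last column of the sorted table (the names sorted_rotations and bwt are hypothetical). This is the naive construction; it builds every rotation explicitly and is meant for small examples only.

```python
def sorted_rotations(text):
    """All circular shifts of text, in lexicographic order."""
    return sorted(text[i:] + text[:i] for i in range(len(text)))


def bwt(text):
    """Last column of the sorted rotation table."""
    return "".join(rotation[-1] for rotation in sorted_rotations(text))


for row in sorted_rotations("BANANA"):
    print(row)
print(bwt("BANANA"))   # NNBAAA
print(bwt("BANANA$"))  # ANNB$AA ('$' sorts before the letters in ASCII)
```

The output NNBAAA has the letters clustered into runs, which is what makes the run-length idea usable afterwards.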
Third step -> sort; notice that we now have the first two columns of the initial table:
AB
AN
AN
BA
NA
NA
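The "prepend the last column, then sort" step can be repeated to rebuild the table column by column. Below is a sketch under that assumption; first_k_columns and inverse_bwt are hypothetical names, and the '$' end marker is assumed so that the original string can be singled out among the rotations.

```python
def first_k_columns(last_column, k):
    """Rebuild the first k columns of the sorted rotation table."""
    table = [""] * len(last_column)
    for _ in range(k):
        # Prepend the last column to the current rows, then sort again.
        table = sorted(ch + row for ch, row in zip(last_column, table))
    return table


def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the full table and return the row that ends with the sentinel."""
    for row in first_k_columns(last_column, len(last_column)):
        if row.endswith(sentinel):
            return row
    raise ValueError("sentinel not found in the transformed string")


print(first_k_columns("NNBAAA", 2))  # ['AB', 'AN', 'AN', 'BA', 'NA', 'NA']
print(inverse_bwt("ANNB$AA"))        # BANANA$
```

Without an end marker the full table only gives the set of rotations (ABANAN, ..., NANABA), so one extra piece of information, such as the row index of the original string, would be needed to pick BANANA.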