You are on page 1of 18

Affine Gap Penalty Function

BCB 597
Fall, 2007

Multiple Base Insertion/Deletion Mutation Event .

Constant Gap Penalty • Each gap character is penalized a constant amount: γ • Not a good gap penalty for our mutation event model • What if we had a more general gap penalty function? .

j  k ]  W (k )   k 1 .0]  0 • O(n3) running time S[i.0]  W (i ) • Cannot directly apply space saving S[0. j  1]   (i  1. j ]  max  max S [i  k . j ]  W ( j ) • O(nm) space  S [i  1. j  1)  i S[i. j ]  W (k )  k 1  j  max S [i. General Gap Penalty Function • Gap penalty function: W (g ) • Each entry in the dynamic programming table now takes O(n) time to fill. S[0.

– Small g. Affine Gap Penalty • W(i)=g+hi (for i >= 1) • g: gap opening penalty • h: gap extension penalty • The ratio between g and h affects how much we view the size of the gap. Small h: size less important . Large h: size more important – Large g.

BA) = E(AA[0.L]. BA [0.i]) + E(AA[i+1.L]) . BA[i+1. Remember our “What if?” • E(AA.i].

TCT---GAGGTAGA optimal AGCTAGAGACCAGC TATCTAGAGGTAGA   1. this was a sufficient condition. h  1 However. --TCT-GAGGTAGA not optimal AGCTAGAGACCAGC TATCTAGAGGTAGA optimal AGCTAGAGACCAG. Alignment Decomposition No Longer Behaves optimal AGCTAGAGACCAG---TCT-GAGGTAGA AGCTAGAGACCAGCTATCTAGAGGTAGA optimal AGCTAGAGACCAG. g  3. not a necessary condition! .   1.

Amortize the cost of opening AGCTAAGGAA • Consider filling out the dynamic AGCTAAGGAC programming table > • The cost of g is eventually spread AGCTAAGGA- out over the length of the gap AGCTAAGGAC once the gap is closed.” AGCTAAGGACT • We cannot predict ahead to know how much the gap opening AGCTAAGGAAAA will cost per gap character. > • However. or we AGCTAAGGACTT would directly know if the gap < opening will be worth it. the one time loss might AGCTAAGGA-- eventually become “worth it. AGCTAAGGA--- AGCTAAGGACTT . • While gap opening might initially AGCTAAGGAAA cause the alignment score to be AGCTAAGGACT less than a mismatch.

under the restriction that the last character in BA be aligned with a gap. . • Table GB: Stores the best alignment score. under the restriction that the last character in AA be aligned with a gap.” • Table GA: Stores the best alignment. in case they might eventually become “worth it. we simple store the best possible alignments that end in gaps. Three Dynamic Programming Tables • Instead of looking to the future. • Table S: Stores the best alignment score with no restrictions.

j  1]  g  h S [i. Three Recursions  G A [i  1. j ] . j ]  max  G A [i. j ]   GB [i. j ]  h G A [i.0]  g  ih S [i  1. j ]  g  h GB [i. j ]  g  jh GB [i.0]    GB [i. j ]   S [i  1. j  1) S [0.0]  0  S[i. j ]  max  G A [0. j ]  max  S [i. j  1]  h S [0. j  1]   (i  1.

j].” • Paths through GA must move up • Paths through GB must move left .j] = GA[i.j] or GB[i. without making a “move. Traceback • The path can now travel • Start in table S • If S[i. jump to the corresponding table.

j ]  h G A [i. j  1]  h S [0. j  1]  g  h S [i.0]    GB [i. j ] . j ]  0 GB [i. j ]  max  S [i. j ]   GB [i. j  1) S [0. j ]   S [i  1. j ]  max  G A [i. j ]  max  G A [0.0]  0 S [i  1. Semiglobal Alignment  G A [i  1. j  1]   (i  1. j ]  g  h GB [i.0]  0  S[i.

Local Alignment • The key to Local Alignment is that the aligning region must still start and end with a matching character. • This means that a local alignment path must start and end in table S. • Thus finding local alignment is very similar to local alignment for constant gap penalty. .

j ]  g  h GB [i. j ]  GB [i. Local Alignment  G A [i  1. j ]  max  S [0. j  1]  h GB [i. j ]  max  G A [i. j ]   S [i  1.0]  0  S [0. j  1]   (i  1. j ]  max  G A [0. j ]  0 S [i. j  1)  S[i. j ]  h G A [i. j ]   0 .0]  0 S [i  1. j  1]  g  h S [i.0]    GB [i.

Applying Space Saving To Affine Gap Alignment .

j ]  h j j  G [i. j ]  G A [i  1. Finding the Intersection  S[i. j ]  g  h j j  A A max  S[i. j ) j  j 1 . j  1]   (i. j ]  S[i  1. j ]  g  h j j  G [i. j ]  h j j j  A S[i. j ]  S[i  1. j ]  S [i  1. j ]  G [i  1.

j ) … g+2h g+h g+3h g+2h g+h … g+3h g+2h g+h … g+h g+2h … .Diagonal Intersection  (i.

Vertical Intersection gh … 2h h g+3h g+2h g+h … g+3h g+2h g+h … h 2h … .