
Programming strategies

• Dynamic programming;

This strategy is appropriate when many values are
computed many times (for example in a recursion). The
dynamic programming approach reduces the number of
times that the values are computed by storing them in a
table, ready to be looked up in case they are needed later
in the computation.

• Divide and conquer;

This strategy is appropriate when the problem may be
split into two halves, for the sake of efficiency.
Programming strategies

• Dynamic programming;

Computing the Fibonacci numbers; longest common
subsequence.

• Divide and conquer;

Binary search; fast power;...
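
As an illustration of the divide-and-conquer strategy, here is a
minimal sketch of the "fast power" idea mentioned above: to compute
x^n, split the exponent in half so that only about log n
multiplications are needed. (The function name fast_power is just
illustrative.)

long long fast_power(long long x, int n) {
    if (n == 0) return 1;                       // x^0 = 1
    long long half = fast_power(x, n / 2);      // solve the half-sized problem
    if (n % 2 == 0) return half * half;         // x^n = (x^(n/2))^2 when n is even
    else return half * half * x;                // one extra factor of x when n is odd
}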


The Fibonacci numbers

int fib(int n) {
    if (n <= 1) return 1;
    else return fib(n-1) + fib(n-2);
}

In general fib(2) will need to be computed exponentially many
times in the computation of fib(n). Similarly, other values of
fib(n) need to be computed many times with this implementation.
                 fib(4)
                /      \
           fib(3)      fib(2)
           /    \      /     \
       fib(2) fib(1) fib(1) fib(0)
       /    \
   fib(1) fib(0)
The classic application of dynamic programming to this
problem is to create a “look-up” table to store results
for later.

int fast_fib(int n) {
    if (n <= 1) return 1;
    else {
        int lookUp[n+1];
        lookUp[0] = 1; lookUp[1] = 1;
        for (int j = 2; j < n+1; j++)
            lookUp[j] = lookUp[j-1] + lookUp[j-2];
        return lookUp[n];
    }
}

In this example we can do even better by saving only the
last two items, so we can do without the whole table
altogether.
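
As noted above, a sketch of the version that keeps only the last two
values (the name fast_fib2 is just illustrative; it uses the same
convention fib(0) = fib(1) = 1):

int fast_fib2(int n) {
    if (n <= 1) return 1;
    int prev = 1, curr = 1;            // fib(0) and fib(1)
    for (int j = 2; j <= n; j++) {
        int next = prev + curr;        // fib(j) = fib(j-1) + fib(j-2)
        prev = curr;
        curr = next;
    }
    return curr;                       // fib(n)
}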
The longest common subsequence problem.

A subsequence of a string is obtained by deleting any
number of elements from any positions. A longest
common subsequence of s1 and s2 is a subsequence of
both whose length is maximal.

eg. Let s1= “abdebcbb”, s2= “adacbcb”, then
“adcbb” and “adbcb” are both longest common
subsequences.

The longest common subsequence problem is to find
the length of the longest common subsequence. (The
algorithm can be adapted to find an actual longest
common subsequence.)

The obvious “brute force” way is as follows, and is very
inefficient.

1. Compute all the subsequences of s1 and s2;

2. Compare each pair of subsequences.

Let L1= length(s1) and L2= length(s2).

There are 2^(L1) subsequences of s1, and 2^(L2) subsequences
of s2. Thus it takes at least 2^(L1+L2) comparisons.
(This is exponential in the length of the strings!)
The longest common subsequence problem.

First of all we’ll look for a different approach, and then
improve it further using dynamic programming.
Let longest(s1, s2) stand for “any longest common
subsequence of s1 and s2”. We’ll try to understand
how to analyse this for various cases.
Case 1: Either s1 or s2 is empty.
longest(s1, s2)= “” (ie it is the empty string)

Case 2: Both s1 and s2 begin with the same letter (eg
“a”, so that s1 = a(ss1) and s2 = a(ss2)).
longest(s1, s2)= a(longest(ss1, ss2)) (ie, the
longest common subsequence must begin with a)

Case 3: Strings s1 and s2 begin with different letters
(eg s1= a(ss1) and s2= b(ss2)), so
longest(s1, s2)= EITHER longest(ss1, b(ss2)) OR
longest(a(ss1), ss2), whichever is longer. (ie we must
cross out either a or b).
This analysis yields a recursive solution, and we note
that instead of having to look at all subsequences, we
only need to look at subsequences obtained by
removing the initial letters, but even so it will still be
very inefficient. (Why?)
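
For comparison, a minimal sketch of this case analysis as a direct
(un-memoised) recursion computing just the length; the name
naive_lcss is illustrative, and the repeated recursive calls are
exactly what makes it so inefficient:

#include <string>
#include <algorithm>
using namespace std;

int naive_lcss(const string& s1, const string& s2) {
    if (s1.empty() || s2.empty()) return 0;                   // Case 1
    if (s1[0] == s2[0])
        return 1 + naive_lcss(s1.substr(1), s2.substr(1));    // Case 2: keep the common letter
    return max(naive_lcss(s1.substr(1), s2),                  // Case 3: cross out s1's first letter,
               naive_lcss(s1, s2.substr(1)));                 //         or cross out s2's first letter
}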

Recall that we’re designing a program to compute only
the length of the longest common subsequence,
although we need the analysis for “longest” on the
previous slide even to do that!

We’ll build a table to store the results in case they are
needed again. We shall need a 2-dimensional table.
(Why?)
The LookUp table for computing lcss(s1, s2), ie the
length of the longest common subsequence, uses the
following idea.

The value in LookUp[i][j]= the length of the longest
common subsequence of s1.substr(i, L1-1) and
s2.substr(j, L2-1) (ie of the suffixes of s1 and s2 starting
at positions i and j), where L1 and L2 are the lengths of
s1 and s2 respectively.

This means once we have filled the table we will be
able to find lcss(s1, s2) in LookUp[0][0]. (Why?)

We start filling the table up at indices (L1, j) and (i, L2).
(Why?)
Rules for filling up the LookUp table.

Rule 1: If either s1.substr(i, L1-1) or s2.substr(j, L2-1) is
empty, then LookUp[i][j]= 0. (When does this occur?)

Rule 2: Both s1.substr(i, L1-1) and s2.substr(j, L2-1)
begin with the same letter.

LookUp[i][j]= 1+ LookUp[i+1][j+1]

Rule 3: Substrings s1.substr(i, L1-1) and s2.substr(j, L2-1)
begin with different letters.

LookUp[i][j]=
maximum(LookUp[i+1][j], LookUp[i][j+1])
Let s1= “abde”, s2= “ada”, and consider their
corresponding LookUp table (showing the rule used for each entry).

            “abde”       “bde”        “de”         “e”          “”
“ada”       1+ 1         max(1, 1)    max(1, 0)    max(0, 0)    0
“da”        max(1, 1)    max(0, 1)    1+ 0         max(0, 0)    0
“a”         1+ 0         max(0, 0)    max(0, 0)    max(0, 0)    0
“”          0            0            0            0            0

Rule 1 applies (the 0 entries in the last row and column), since at
least one of the substrings is empty.
Rule 2 applies (the “1+” entries), since the substrings begin with
the same letter.
Rule 3 applies (the “max” entries), since the substrings begin with
different letters.


Let s1= “abde”, s2= “ada”, and consider their
corresponding LookUp table, now with each entry evaluated.

         “abde”   “bde”   “de”   “e”   “”
“ada”       2       1      1     0     0
“da”        1       1      1     0     0
“a”         1       0      0     0     0
“”          0       0      0     0     0
#include <string>
#include <vector>
using namespace std;

class subsequence
{
public:
    // Constructor: initialises the strings and the LookUp table.
    subsequence(string s, string ss);

    // Computes the length of the longest common subsequence of
    // s1.substr(i, L1-1) and s2.substr(j, L2-1), and fills in all
    // the LookUp entries needed along the way.
    int lcss(int i, int j);

private:
    string s1;
    string s2;
    int L1;                      // Length of s1
    int L2;                      // Length of s2
    vector<vector<int>> LookUp;  // (L1+1) x (L2+1) table; -1 means "not yet computed"
};
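
A possible constructor (a sketch, assuming every table entry is set
to -1 to mean “not yet computed”, which is what the lcss code below
tests for):

subsequence::subsequence(string s, string ss) {
    s1 = s;
    s2 = ss;
    L1 = s1.length();
    L2 = s2.length();
    // (L1+1) x (L2+1) entries, all initialised to -1 ("not yet computed").
    LookUp = vector<vector<int>>(L1 + 1, vector<int>(L2 + 1, -1));
}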
A recursive solution.

int subsequence::lcss(int i, int j) {
    if (LookUp[i][j] == -1) {   // ... if LookUp[i][j] has not been computed, compute it ...
        if (i >= s1.length() || j >= s2.length())
            LookUp[i][j] = 0;                      // Apply Rule (1).
        else {
            if (s1[i] == s2[j]) {                  // Apply Rule (2)...
                int t = lcss(i+1, j+1);            // ... but look up lcss(i+1, j+1)
                LookUp[i][j] = t + 1;
            }
            else {                                 // Apply Rule (3)...
                int t1 = lcss(i+1, j);             // ... but look up lcss(i+1, j)
                int t2 = lcss(i, j+1);             // ... and lcss(i, j+1)
                if (t1 > t2) LookUp[i][j] = t1;
                else LookUp[i][j] = t2;
            }
        }
    }
    return LookUp[i][j];   // In either case, just return the computed value of LookUp[i][j]
}

This is O(n^2), where n is the maximum of L1 and L2. (Why?)


In fact we can get rid of the recursion altogether
if we’re careful about how to fill in the LookUp
table.

We notice that LookUp[i][j] only depends on
EITHER LookUp[i+1][j] OR LookUp[i][j+1] OR
LookUp[i+1][j+1].

If we fill the array from bottom-to-top, right-to-left
then we’ll always have the items we need to
hand.
         “abde”   “bde”   “de”   “e”   “”
“ada”       2       1      1     0     0
“da”        1       1      1     0     0
“a”         1       0      0     0     0
“”          0       0      0     0     0
An iterative solution.

int subsequence::lcss(int i, int j) {
    if (LookUp[i][j] == -1) {   // ... if LookUp[i][j] has not been computed, compute it ...
        for (int h = s1.length(); h >= i; h--)
            for (int v = s2.length(); v >= j; v--) {
                if (h >= s1.length() || v >= s2.length())
                    LookUp[h][v] = 0;                            // Rule 1..
                else if (s1[h] == s2[v])
                    LookUp[h][v] = 1 + LookUp[h+1][v+1];         // Rule 2..
                else {                                           // Rule 3..
                    if (LookUp[h+1][v] > LookUp[h][v+1]) LookUp[h][v] = LookUp[h+1][v];
                    else LookUp[h][v] = LookUp[h][v+1];
                }
            }
    }

    return LookUp[i][j];
}
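
Either version of lcss is then used in the same way; a hypothetical
example (assuming the constructor sketched above):

#include <iostream>

int main() {
    subsequence example("abde", "ada");   // the strings from the LookUp table above
    cout << example.lcss(0, 0) << endl;   // prints 2, the value found in LookUp[0][0]
    return 0;
}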
Tree data types:

• Binary search trees;

• Quadtrees.
Trees can be used to represent parent/child
relationships between data.

Trees consist of nodes connected by edges or arcs.
The connections are directional, and there are no
“loops”.

Trees are hierarchical, so that if c is between a and b,
then c is the child of a and the parent of b. Also b is a
descendant of a, and a is an ancestor of b.

The ancestor of all the nodes in a tree is the root, and
it has no parent. A node without children is called a
leaf.

    a
    |
    c
    |
    b
Nodes may have several children. Trees such that all
nodes have at most two children are called binary
trees, and we’ll be studying them for a while.

A tree T is a binary tree if:

EITHER T is empty,

OR T is not empty and consists of a set of nodes such that
(a) exactly one node is the root;
(b) all the other nodes are partitioned into two
disjoint subsets of descendants called the left
subtree and the right subtree; each subtree is a
binary tree.
Some definitions

Let H(T) be the height of a binary tree, which we
define as follows.

H(T)= 0, if T is empty;
H(T)= 1 + maximum(H(T_1), H(T_2)), where
T_1 and T_2 are the subtrees of T.

Roughly speaking, the height of the tree is the number
of “levels” when a tree is drawn neatly, with all nodes
of the same generation on the same level.
More definitions.

A binary tree is said to be full if all nodes on levels less
than H(T) have two children each.

Roughly speaking, a full binary tree has no “missing
nodes”.

A binary tree is balanced if the height of any node’s
right subtree differs from the height of its left subtree
by no more than 1.
Tree traversals.

Given a binary tree, we will be processing the items
inside it. To do this we will need to be able to “visit”
each item. There are three ways to traverse a binary
tree (ie visit each item), and we call them preorder,
postorder and inorder traversals.

preorder traversal: each node is “processed” before
the nodes in its subtrees;

postorder traversal: each node is “processed” after
the nodes in its subtrees;

inorder traversal: each node is “processed” in
between the nodes of its left and right subtrees.
Suppose that “processing” means “printing out”.
preorder output: a b d e c f
postorder output: d e b f c a
inorder output: d b e a c f

        a
      /   \
     b     c
    / \     \
   d   e     f
Programming trees.

We may represent a tree in a C++ program as an
array, or in a pointer-based representation. For the
time being we’ll use a pointer-based representation.

struct treeNode {
    int item;
    treeNode* LChildPtr;
    treeNode* RChildPtr;
};

For a pointer-based implementation in C++, we use the
same idea as for linked lists: we create a struct
containing the data (char, string, int etc.) together with
(this time) two “links”, one for the left child and one
for the right child.
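
Using this struct, the height H(T) defined earlier can be computed
with the same kind of recursion (a sketch, assuming an empty tree is
represented by a NULL pointer):

int height(treeNode* T) {
    if (T == NULL) return 0;                    // H(T) = 0 if T is empty
    int hLeft = height(T->LChildPtr);
    int hRight = height(T->RChildPtr);
    if (hLeft > hRight) return 1 + hLeft;       // H(T) = 1 + maximum(H(T_1), H(T_2))
    else return 1 + hRight;
}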
Suppose that we already have a tree constructed. We
can implement preorder, postorder and inorder
traversals using a simple recursion.

// Current node's data printed first, before the left/right subtrees' nodes.
void preorder (treeNode* T) {
    if (T != NULL) {
        cout << T->item;
        preorder(T->LChildPtr);
        preorder(T->RChildPtr);
    }
}

// Current node's data printed after the left/right subtrees' nodes.
void postorder (treeNode* T) {
    if (T != NULL) {
        postorder(T->LChildPtr);
        postorder(T->RChildPtr);
        cout << T->item;
    }
}

// Current node's data printed after the left subtree's nodes,
// and before the right subtree's nodes.
void inorder (treeNode* T) {
    if (T != NULL) {
        inorder(T->LChildPtr);
        cout << T->item;
        inorder(T->RChildPtr);
    }
}
Suppose that “processing” means “printing out”.
preorder output: a b d e c f
postorder output: d e b f c a
inorder output: d b e a c f

        a
      /   \
     b     c
    / \     \
   d   e     f
Binary search trees are a special type of binary tree
in which searching is easy, because the nodes are all
ordered relative to each other. (Carrano, pages
536--574)

A binary search tree (BST) is defined as follows.
T is a binary search tree if it is a binary tree, and the
following “BST conditions” apply:

(a) T’s root item is greater than all the node items of
its left subtree, and

(b) T’s root item is less than all the node items of its
right subtree, and

(c) both of T’s left and right subtrees are binary
search trees (so that (a) and (b) apply to their roots).
        f
      /   \
     b     g
    / \     \
   a   d     h

Here’s an example of a binary search tree.

Where is the smallest item found?
Where is the greatest item found?
What does an inorder traversal produce?
Next we’ll consider how to search a binary search
tree.

The basic idea is: given a search key K and a tree T, we
can recursively search the tree, just as in binary
search, exploring either the left subtree or the right
subtree depending on whether K is less than, or
greater than, the value at the current node.
void search (treeNode* TreePtr, int K, bool& success) {
// POST: sets success to true if K is in the tree, and false otherwise

    if (TreePtr == NULL) { success= false; }
    else if (K == TreePtr->item) { success= true; }
    else if (K < TreePtr->item) { search(TreePtr->LChildPtr, K, success); }
    else { search(TreePtr->RChildPtr, K, success); }
}

The complexity of this algorithm depends on the shape of the tree.
(Why?)
Next we’ll consider how to build a binary search
tree.

The basic idea is that we’ll start from an empty tree
and then just add items, making sure that the BST
conditions are maintained.

To do that, suppose we already have a binary search
tree T, and we want to add an item, x. We have to
decide where in the tree to put it. (We’ll consider
how to put it after we’ve figured out where to put it.)
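
One way to do both steps at once is to walk down from the root
exactly as in search, and attach the new node at the empty position
where that walk ends; a minimal sketch (assuming no duplicate items):

void insert (treeNode*& TreePtr, int x) {
    if (TreePtr == NULL) {                  // found the empty spot: attach x here
        TreePtr = new treeNode;
        TreePtr->item = x;
        TreePtr->LChildPtr = NULL;
        TreePtr->RChildPtr = NULL;
    }
    else if (x < TreePtr->item) insert(TreePtr->LChildPtr, x);   // x belongs in the left subtree
    else insert(TreePtr->RChildPtr, x);                          // x belongs in the right subtree
}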
