Dynamic Programming
Dynamic programming is a technique in computer programming that
efficiently solves a class of problems that have overlapping subproblems
and the optimal substructure property.
If a problem can be divided into subproblems, which in turn are divided
into smaller subproblems, and some of these subproblems overlap, then the
solutions to the subproblems can be saved for future reference. Each
subproblem is then solved only once, which avoids repeated computation.
This method of solving a problem is referred to as dynamic programming.
Let's find the Fibonacci sequence up to the 5th term. The Fibonacci
sequence is the sequence of numbers in which each number is the sum of the
two preceding ones, for example 0, 1, 1, 2, 3.
Algorithm
1. The first term is 0.
2. The second term is 1.
3. The third term is the sum of 0 (from step 1) and 1 (from step 2), which is 1.
4. The fourth term is the sum of the third term (from step 3) and second term
(from step 2) i.e. 1 + 1 = 2.
5. The fifth term is the sum of the fourth term (from step 4) and third term
(from step 3) i.e. 2 + 1 = 3.
Hence, we have the sequence 0, 1, 1, 2, 3. Here, we have used the results
of the previous steps as shown below. This is called a dynamic
programming approach.
F(0) = 0
F(1) = 1
F(2) = F(1) + F(0)
F(3) = F(2) + F(1)
F(4) = F(3) + F(2)
var m = map(0 → 0, 1 → 1)
function fib(n)
    if key n is not in map m
        m[n] = fib(n − 1) + fib(n − 2)
    return m[n]
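The pseudocode above translates almost directly into Python. The following
is a minimal sketch, assuming a module-level dictionary named memo (a name
chosen here for illustration) plays the role of the map m:

```python
# Memoized Fibonacci: each F(n) is computed once and then reused.
memo = {0: 0, 1: 1}  # base cases F(0) = 0 and F(1) = 1


def fib(n):
    # Compute and store F(n) only if it has not been seen before.
    if n not in memo:
        memo[n] = fib(n - 1) + fib(n - 2)
    return memo[n]


# First five terms of the sequence
print([fib(i) for i in range(5)])  # [0, 1, 1, 2, 3]
```

Because every stored value is looked up rather than recomputed, the number
of additions grows linearly with n instead of exponentially.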
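Longest Common Subsequence
The longest common subsequence (LCS) of two sequences is the longest
subsequence that is common to both. The elements of a subsequence need not
occupy consecutive positions in the original sequences.
If S1 and S2 are the two given sequences, then Z is a common subsequence
of S1 and S2 if Z is a subsequence of both S1 and S2. Furthermore, the
elements of Z must correspond to strictly increasing sequences of indices
in both S1 and S2.
In other words, the elements chosen from each original sequence must
appear in Z in the same order as they appear in that sequence.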
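The definition can be checked mechanically. The sketch below is only an
illustration: the helper names is_subsequence and is_common_subsequence are
chosen here, and the sample values are the two sequences used in the
example that follows, written as strings.

```python
def is_subsequence(z, s):
    """Return True if z can be obtained from s by deleting elements."""
    it = iter(s)
    # `x in it` advances the iterator, so matches must appear in order.
    return all(x in it for x in z)


def is_common_subsequence(z, s1, s2):
    return is_subsequence(z, s1) and is_subsequence(z, s2)


s1 = "BCDAACD"
s2 = "ACDBAC"
print(is_common_subsequence("CDAC", s1, s2))  # True
print(is_common_subsequence("CDB", s1, s2))   # False: no B after C, D in s1
```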
If
S1 = {B, C, D, A, A, C, D}
S2 = {A, C, D, B, A, C}
Then the common subsequences include {B, C}, {C, D, A, C}, {D, A, C},
{A, A, C}, and others. Among these, {C, D, A, C} is the longest common
subsequence.
The following steps are followed for finding the longest common
subsequence.
3. If the characters corresponding to the current row and current column
match, fill the current cell by adding one to the diagonal element.
Point an arrow to the diagonal cell.
4. Otherwise, take the maximum of the values in the previous column and
previous row to fill the current cell. Point an arrow to the cell with the
maximum value. If they are equal, point to either of them.
7. To find the longest common subsequence, start from the last cell and
follow the direction of the arrows. The elements at which a diagonal arrow
is followed form the longest common subsequence.
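The table-filling and traceback steps can be sketched in Python as follows.
This is a minimal illustration under a few assumptions: the function name
lcs is chosen here, the table is given an extra first row and column of
zeros, and the arrows are not stored but re-derived during the walk back by
comparing neighbouring cells.

```python
def lcs(x, y):
    m, n = len(x), len(y)
    # table[i][j] = LCS length of x[:i] and y[:j]; row 0 and column 0 stay 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1  # diagonal arrow
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the last cell, collecting characters at diagonal moves.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("BCDAACD", "ACDBAC"))  # CDAC
```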
In the above dynamic programming algorithm, the results of each comparison
between the elements of X and the elements of Y are stored in a table so
that they can be reused in later computations.
So, the time taken by the dynamic programming approach is the time taken
to fill the table, i.e. O(mn), where m and n are the lengths of X and Y.
The plain recursive algorithm, by contrast, has a complexity of
O(2^max(m, n)).
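To get a feel for the difference, the sketch below counts the calls made by
a plain recursive version on a worst-case input where no characters match.
The function name lcs_naive and the sample strings are assumptions made for
this illustration only.

```python
def lcs_naive(x, y):
    """Plain recursion; returns (LCS length, number of recursive calls)."""
    calls = 0

    def go(i, j):
        nonlocal calls
        calls += 1
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return 1 + go(i - 1, j - 1)
        return max(go(i - 1, j), go(i, j - 1))

    return go(len(x), len(y)), calls


# Worst case: no characters match, so every call branches twice.
length, calls = lcs_naive("AAAAAA", "BBBBBB")
print(length, calls)      # 0 1847
print((6 + 1) * (6 + 1))  # 49 cells in the dynamic programming table
```

The recursion revisits the same (i, j) pairs many times, while the table
computes each pair exactly once.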
Greedy Algorithm
A greedy algorithm is an approach for solving a problem by selecting the
best option available at the moment. It doesn't consider whether the
current best choice will lead to the overall optimal result.
The algorithm never reverses an earlier decision, even if the choice was
wrong. It works in a top-down approach.
This algorithm may not produce the best result for every problem, because
it always makes the locally best choice in the hope of reaching the
globally best result.
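However, we can determine whether the algorithm can be used for a problem
by checking whether the problem has the following properties:
1. Greedy Choice Property
If an optimal solution to the problem can be found by making the best
choice at each step, without reconsidering earlier choices, then the
problem can be solved using a greedy approach. This property is called the
greedy choice property.
2. Optimal Substructure
If the optimal overall solution to the problem corresponds to the optimal
solutions to its subproblems, then the problem can be solved using a greedy
approach. This property is called optimal substructure.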
Advantages of Greedy Approach
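The greedy approach does not always produce the optimal solution, though.
For example, suppose we want to find the longest path from root to leaf in
the graph below, i.e. the path with the largest total weight. Let's use the
greedy algorithm here.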
Greedy Approach
1. Let's start with the root node 20. The weight of the right child is 3 and the
weight of the left child is 2.
2. Our problem is to find the largest path, and the best choice at the
moment is 3. So, the greedy algorithm will choose 3.
3. Finally, the only child of 3 has weight 1. This gives us our final
result 20 + 3 + 1 = 24.
However, it is not the optimal solution. There is another path that carries
more weight ( 20 + 2 + 10 = 32 ) as shown in the image below.
Longest path
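The same comparison can be reproduced in a few lines of Python. The tree
shape below is an assumption reconstructed from the numbers in the text
(root 20, left child 2 with child 10, right child 3 with child 1), and the
function names are chosen here for illustration.

```python
# Each node is (weight, list of children). The shape is assumed from the text.
tree = (20, [
    (2, [(10, [])]),
    (3, [(1, [])]),
])


def greedy_path_sum(node):
    weight, children = node
    total = weight
    while children:
        # Commit to the heaviest child and never look back.
        weight, children = max(children, key=lambda c: c[0])
        total += weight
    return total


def best_path_sum(node):
    weight, children = node
    # Try every root-to-leaf path and keep the heaviest.
    return weight + max((best_path_sum(c) for c in children), default=0)


print(greedy_path_sum(tree))  # 24
print(best_path_sum(tree))    # 32
```

The greedy version looks only one level ahead, so it cannot see that the
lighter child 2 leads to the heavier leaf 10.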
Greedy Algorithm
2. At each step, an item is added to the solution set until a solution is reached.
Huffman Coding
Huffman coding is generally useful for compressing data that contains
frequently occurring characters.
Initial string: BCAADDDCCACACAC
Huffman coding first creates a tree using the frequencies of the
characters and then generates a code for each character.
Huffman coding prevents any ambiguity in the decoding process by using the
concept of a prefix code, i.e. the code associated with a character should
not be a prefix of any other code. The tree created above helps in
maintaining this property.
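The prefix property is easy to test for a given set of codes. The sketch
below assumes a helper named is_prefix_free (not part of the original text)
and uses the codes from the size table later in this section.

```python
def is_prefix_free(codes):
    words = sorted(codes.values())
    # After sorting, a code that is a prefix of another sits directly before
    # some code that extends it, so checking neighbours is enough.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


codes = {"A": "11", "B": "100", "C": "0", "D": "101"}
print(is_prefix_free(codes))                   # True
print(is_prefix_free({"A": "0", "B": "01"}))   # False: "0" is a prefix of "01"
```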
Huffman coding is done with the help of the following steps.
1. Calculate the frequency of each character in the string.
2. Sort the characters in increasing order of frequency. These are stored
in a priority queue.
4. Create an empty node z. Assign the minimum frequency to the left child of
z and the second minimum frequency to the right child of z. Set the value
of z to the sum of these two minimum frequencies.
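The merging step can be sketched with Python's heapq module acting as the
priority queue. This is an illustration only: the frequencies are taken
from the size table in this section, internal nodes are represented as
plain (left, right) tuples, and the tiebreak counter is an assumption added
so that heap entries with equal frequencies remain comparable.

```python
import heapq
from itertools import count

freq = {"A": 5, "B": 1, "C": 6, "D": 3}

tiebreak = count()  # keeps heap entries comparable when frequencies tie
heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
heapq.heapify(heap)

while len(heap) > 1:
    f1, _, left = heapq.heappop(heap)    # minimum frequency
    f2, _, right = heapq.heappop(heap)   # second minimum frequency
    z = (left, right)                    # new internal node z
    print(f"merge {f1} + {f2} -> {f1 + f2}")
    heapq.heappush(heap, (f1 + f2, next(tiebreak), z))

total, _, root = heap[0]
print(total)  # 15
```

For these frequencies the merges are 1 + 3, then 4 + 5, then 6 + 9, which
leaves a single root whose value is the length of the string.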
To send the above string over a network, we have to send the tree as well
as the compressed code. The total size is given by the table below.
Character      Frequency   Code   Size
A              5           11     5*2 = 10
B              1           100    1*3 = 3
C              6           0      6*1 = 6
D              3           101    3*3 = 9
4*8 = 32 bits  15 bits            28 bits
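Without encoding, the total size of the string was 120 bits (15 characters
at 8 bits each). After encoding, the size is reduced to 32 + 15 + 28 = 75
bits: 32 bits for the four characters, 15 bits for the frequencies, and 28
bits for the encoded string.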
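The arithmetic can be reproduced from the string and the code table. The
sketch below follows the text's own accounting for the 32 and 15 bit terms;
the variable names are chosen here for illustration.

```python
string = "BCAADDDCCACACAC"
codes = {"A": "11", "B": "100", "C": "0", "D": "101"}

raw_bits = 8 * len(string)                     # 15 characters * 8 bits
encoded = "".join(codes[ch] for ch in string)  # the compressed bit string
encoded_bits = len(encoded)

char_bits = 8 * len(codes)  # the four distinct characters, 8 bits each
freq_bits = len(string)     # the 15 counted in the text for frequencies

print(raw_bits)                              # 120
print(encoded_bits)                          # 28
print(char_bits + freq_bits + encoded_bits)  # 75
```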
Decoding the code
For decoding the code, we can take the code and traverse through the tree
to find the character.
Suppose 101 is to be decoded. We traverse from the root as in the figure
below.
Decoding
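The traversal can be sketched in Python by rebuilding the tree from the
code table as nested dictionaries. The helper names build_tree and decode
and the dictionary representation are assumptions made for this
illustration, not the representation used elsewhere in the text.

```python
codes = {"A": "11", "B": "100", "C": "0", "D": "101"}


def build_tree(codes):
    # Internal nodes are dicts keyed by '0'/'1'; leaves are characters.
    root = {}
    for ch, bits in codes.items():
        node = root
        for b in bits[:-1]:
            node = node.setdefault(b, {})
        node[bits[-1]] = ch
    return root


def decode(bits, root):
    out = []
    node = root
    for b in bits:
        node = node[b]             # follow the 0 or 1 branch
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root            # restart from the root
    return "".join(out)


root = build_tree(codes)
print(decode("101", root))     # D
print(decode("100011", root))  # BCA
```

Because no code is a prefix of another, the walk always reaches exactly one
leaf before the next character begins.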
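create a priority queue Q of the unique characters, ordered by frequency
while Q holds more than one node:
    create a newNode
    extract the minimum from Q and assign it to the left child of newNode
    extract the minimum from Q and assign it to the right child of newNode
    calculate the sum of these two minimum values and assign it to the value of newNode
    insert newNode into Q
return rootNode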
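string = 'BCAADDDCCACACAC'

# Tree node; leaves are plain one-character strings
class NodeTree(object):
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

    def children(self):
        return (self.left, self.right)

    def nodes(self):
        return (self.left, self.right)

    def __str__(self):
        return '%s_%s' % (self.left, self.right)

# Main function implementing huffman coding
def huffman_code_tree(node, left=True, binString=''):
    if type(node) is str:
        return {node: binString}
    (l, r) = node.children()
    d = dict()
    d.update(huffman_code_tree(l, True, binString + '0'))
    d.update(huffman_code_tree(r, False, binString + '1'))
    return d

# Calculating frequency
freq = {}
for c in string:
    if c in freq:
        freq[c] += 1
    else:
        freq[c] = 1

# (node, frequency) pairs, highest frequency first
nodes = sorted(freq.items(), key=lambda x: x[1], reverse=True)

# Merge the two least frequent nodes until only the root remains
while len(nodes) > 1:
    (key1, c1) = nodes[-1]
    (key2, c2) = nodes[-2]
    nodes = nodes[:-2]
    nodes.append((NodeTree(key1, key2), c1 + c2))
    nodes = sorted(nodes, key=lambda x: x[1], reverse=True)

huffmanCode = huffman_code_tree(nodes[0][0])
print(huffmanCode)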
The time complexity for encoding each unique character based on its frequency
is O(n log n), where n is the number of unique characters.
Extracting the minimum frequency from the priority queue takes place 2*(n-1)
times, and each extraction has complexity O(log n). Thus the overall
complexity is O(n log n).
Huffman Coding Applications