
Unit 4

Dynamic Programming
Dynamic programming is a technique in computer programming that efficiently solves a class of problems that have overlapping subproblems and the optimal substructure property.
If a problem can be divided into subproblems, which in turn are divided into smaller subproblems, and if these subproblems overlap, then their solutions can be saved and reused instead of being recomputed. This saves computation time. This method of solving a problem is referred to as dynamic programming.

Such problems involve repeatedly calculating the value of the same subproblems to find the optimum solution.

Dynamic Programming Example

Let's find the Fibonacci sequence up to the 5th term. A Fibonacci series is a sequence of numbers in which each number is the sum of the two preceding ones. For example, 0, 1, 1, 2, 3. Here, each number is the sum of the two preceding numbers.
Algorithm

Let F(n) denote the nth term, with F(0) = 0 and F(1) = 1.

1. If n <= 1, return n.

2. Else, return F(n - 1) + F(n - 2), the sum of the two preceding terms.


We are calculating the Fibonacci sequence up to the 5th term.

1. The first term is 0.

2. The second term is 1.

3. The third term is the sum of 0 (from step 1) and 1 (from step 2), which is 1.

4. The fourth term is the sum of the third term (from step 3) and second term
(from step 2) i.e.  1 + 1 = 2.

5. The fifth term is the sum of the fourth term (from step 4) and third term
(from step 3) i.e.  2 + 1 = 3.

Hence, we have the sequence 0, 1, 1, 2, 3. Here, we have used the results
of the previous steps as shown below. This is called the dynamic
programming approach.

F(0) = 0

F(1) = 1

F(2) = F(1) + F(0)

F(3) = F(2) + F(1)

F(4) = F(3) + F(2)
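Written as a short bottom-up Python loop (nothing beyond the recurrence above is assumed), this is:

f = [0, 1]                          # F(0) and F(1)
for i in range(2, 5):
    f.append(f[i - 1] + f[i - 2])   # F(i) = F(i-1) + F(i-2)
print(f)                            # [0, 1, 1, 2, 3]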

How Dynamic Programming Works

Dynamic programming works by storing the results of subproblems so that when their solutions are required, they are at hand and we do not need to recalculate them.

This technique of storing the values of subproblems is called memoization. By saving the values in an array or map, we avoid recomputing subproblems we have already come across.

# Memoized Fibonacci in Python: store each computed value in a map
m = {0: 0, 1: 1}

def fib(n):
    if n not in m:
        m[n] = fib(n - 1) + fib(n - 2)
    return m[n]
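For example, fib(4) returns 3 (the 5th term computed above), and each F(i) is computed only once, because later calls simply look its value up in m.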

Longest Common Subsequence


The longest common subsequence (LCS) is defined as the longest
subsequence that is common to all the given sequences, provided that the
elements of the subsequence are not required to occupy consecutive
positions within the original sequences.

If  S1  and  S2  are the two given sequences then,  Z  is the common
subsequence of  S1  and  S2  if  Z  is a subsequence of both  S1  and  S2 .
Furthermore,  Z  must be a strictly increasing sequence of the indices of
both  S1  and  S2 .
In a strictly increasing sequence, the indices of the elements chosen from
the original sequences must be in ascending order in  Z .
If

S1 = {B, C, D, A, A, C, D}

Then, {A, D, B} cannot be a subsequence of S1, as the order of the elements is not the same (i.e. the indices are not strictly increasing).
Let us understand LCS with an example.

If

S1 = {B, C, D, A, A, C, D}

S2 = {A, C, D, B, A, C}

Then, the common subsequences are {B, C}, {C, D, A, C}, {D, A, C}, {A, A, C}, {A, C}, {C, D}, ...

Among these subsequences, {C, D, A, C} is the longest common subsequence. We are going to find this longest common subsequence using dynamic programming.

Before proceeding further, if you are not already familiar with dynamic programming, please go through the dynamic programming section above.

Using Dynamic Programming to find the LCS

Let us take two sequences, X (the first sequence) and Y (the second sequence).
The following steps are followed for finding the longest common
subsequence.

1. Create a table of dimension (n+1) × (m+1), where n and m are the lengths of X and Y respectively. The first row and the first column are filled with zeros.

2. Fill each cell of the table using the following logic.

3. If the characters corresponding to the current row and the current column match, fill the current cell by adding one to the diagonal element. Point an arrow to the diagonal cell.

4. Else, fill the current cell with the maximum of the value in the previous row (the cell above) and the value in the previous column (the cell to the left). Point an arrow to the cell with the maximum value. If they are equal, point to either of them.



5. Step 2 is repeated until the table is filled.



6. The value in the last row and the last column is the length of the longest common subsequence (the bottom-right corner of the table).

7. In order to find the longest common subsequence, start from the last element and follow the direction of the arrows. The elements corresponding to the diagonal arrows form the longest common subsequence.

Thus, the longest common subsequence is  CA .


How is a dynamic programming algorithm more efficient than the recursive algorithm while solving an LCS problem?

The method of dynamic programming reduces the number of function calls. It stores the result of each function call so that it can be used in future calls without the need for redundant calls.

In the above dynamic programming algorithm, the results obtained from each comparison between elements of X and elements of Y are stored in a table so that they can be used in future computations.

So, the time taken by the dynamic approach is the time taken to fill the table, i.e. O(mn). The recursive algorithm, in contrast, has a complexity of O(2^max(m, n)).
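What follows is a short Python sketch of the table-filling procedure described above. The function name lcs and the use of strings for the two sequences are only choices for this illustration; the backtracking loop recovers the subsequence by re-checking for matches instead of storing explicit arrows.

# Longest common subsequence by dynamic programming
def lcs(X, Y):
    n, m = len(X), len(Y)
    # (n+1) x (m+1) table, first row and first column filled with zeros
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1        # match: diagonal + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # walk back from the bottom-right corner to recover the subsequence
    result = []
    i, j = n, m
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            result.append(X[i - 1])
            i, j = i - 1, j - 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(result))

print(lcs('BCDAACD', 'ACDBAC'))   # CDAC, the longest common subsequence of length 4

Running it on the sequences S1 and S2 from the earlier example prints CDAC, matching the subsequence {C, D, A, C} found above.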
Greedy Algorithm
A greedy algorithm is an approach for solving a problem by selecting the
best option available at the moment. It doesn't worry whether the current
best result will bring the overall optimal result.

The algorithm never reverses an earlier decision, even if the choice is wrong. It works in a top-down approach.

This algorithm may not produce the best result for all problems, because it always goes for the locally best choice in the hope of producing the globally best result.

However, we can determine whether a greedy algorithm can be used for a problem by checking that the problem has the following properties:

1. Greedy Choice Property


If an optimal solution to the problem can be found by choosing the best
choice at each step without reconsidering the previous steps once chosen,
the problem can be solved using a greedy approach. This property is called
greedy choice property.

2. Optimal Substructure
If the optimal overall solution to the problem corresponds to the optimal
solution to its subproblems, then the problem can be solved using a greedy
approach. This property is called optimal substructure.
Advantages of Greedy Approach

- The algorithm is easier to describe.

- This algorithm can perform better than other algorithms (but not in all cases).

Drawback of Greedy Approach

As mentioned earlier, the greedy algorithm doesn't always produce the optimal solution. This is the major disadvantage of the algorithm.

For example, suppose we want to find the longest path in the graph below
from root to leaf. Let's use the greedy algorithm here.

Apply the greedy approach to this tree to find the longest route.

Greedy Approach
1. Let's start with the root node 20. The weight of the right child is 3 and the weight of the left child is 2.

2. Our problem is to find the largest path. The best choice at the moment is 3, so the greedy algorithm chooses 3.

3. Finally, the only child of 3 has weight 1. This gives us the final result 20 + 3 + 1 = 24.

However, it is not the optimal solution. There is another path that carries
more weight ( 20 + 2 + 10 = 32 ) as shown in the image below.

Longest path

Therefore, greedy algorithms do not always give an optimal/feasible solution.
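To make this concrete, here is a small Python sketch of the example tree above. The node values 20, 2, 3, 10 and 1 are taken from the figure; the tuple representation and the function names are only for illustration.

# Each node is (value, list of children); weights taken from the example above
tree = (20, [(2, [(10, [])]),      # left branch: 2 -> 10
             (3, [(1, [])])])      # right branch: 3 -> 1

def greedy_path(node):
    value, children = node
    total = value
    while children:
        # always pick the child with the largest immediate value
        value, children = max(children, key=lambda c: c[0])
        total += value
    return total

def best_path(node):
    # exhaustively try every root-to-leaf path
    value, children = node
    if not children:
        return value
    return value + max(best_path(child) for child in children)

print(greedy_path(tree))   # 24  (20 + 3 + 1)
print(best_path(tree))     # 32  (20 + 2 + 10)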

Greedy Algorithm

1. To begin with, the solution set (containing answers) is empty.

2. At each step, an item is added to the solution set until a solution is reached.

3. If the solution set is feasible, the current item is kept.

4. Else, the item is rejected and never considered again.


Huffman Coding
Huffman Coding is a technique of compressing data to reduce its size
without losing any of the details. It was first developed by David Huffman.

Huffman coding is generally useful for compressing data in which some characters occur frequently.

How Huffman Coding Works

Suppose the string BCAADDDCCACACAC (the string used in the Python example below) is to be sent over a network.

Each character occupies 8 bits. There are a total of 15 characters in the above string. Thus, a total of 8 * 15 = 120 bits are required to send this string.
Using the Huffman Coding technique, we can compress the string to a
smaller size.

Huffman coding first creates a tree using the frequencies of the characters and then generates a code for each character.

Once the data is encoded, it has to be decoded. Decoding is done using the same tree.

Huffman coding prevents any ambiguity in the decoding process using the concept of prefix codes, i.e. the code associated with a character must not be a prefix of the code of any other character. The tree created in the steps below helps in maintaining this property.
Huffman coding is done with the help of the following steps.

1. Calculate the frequency of each character in the string.


2. Sort the characters in increasing order of frequency. These are stored in a priority queue Q.

3. Make each unique character a leaf node.

4. Create an empty node z. Assign the minimum frequency to the left child of z and the second minimum frequency to the right child of z. Set the value of z to the sum of these two minimum frequencies.


5. Remove these two minimum frequencies from Q and add the sum into the list of frequencies (* denotes an internal node in the figure above).
6. Insert node  z  into the tree.

7. Repeat steps 3 to 5 for all the characters.

8. For each non-leaf node, assign 0 to the left edge and 1 to the right edge.


For sending the above string over a network, we have to send the tree as well as the compressed code. The total size is given by the table below.

Character       | Frequency | Code | Size
A               | 5         | 11   | 5 * 2 = 10
B               | 1         | 100  | 1 * 3 = 3
C               | 6         | 0    | 6 * 1 = 6
D               | 3         | 101  | 3 * 3 = 9
4 * 8 = 32 bits | 15 bits   |      | 28 bits

Without encoding, the total size of the string was 120 bits. After encoding, the size is reduced to 32 + 15 + 28 = 75 bits (32 bits for the characters, 15 bits for the frequencies, and 28 bits for the encoded string).
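As a quick check of this arithmetic, here is a throwaway Python snippet using the frequencies and codes from the table above:

freqs = {'A': 5, 'B': 1, 'C': 6, 'D': 3}
codes = {'A': '11', 'B': '100', 'C': '0', 'D': '101'}
encoded_bits = sum(freqs[c] * len(codes[c]) for c in freqs)
print(encoded_bits)                # 28 bits for the encoded string
print(8 * sum(freqs.values()))     # 120 bits without encoding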
Decoding the code

For decoding the code, we can take the code and traverse through the tree
to find the character.

Suppose 101 is to be decoded. We can traverse from the root as in the figure below.

Decoding

Huffman Coding Algorithm

create a priority queue Q consisting of each unique character
sort them in ascending order of their frequencies
for all the unique characters:
    create a newNode
    extract minimum value from Q and assign it to leftChild of newNode
    extract minimum value from Q and assign it to rightChild of newNode
    calculate the sum of these two minimum values and assign it to the value of newNode
    insert this newNode into the tree
return rootNode

Python Example

# Huffman Coding in python

string = 'BCAADDDCCACACAC'


# Creating tree nodes
class NodeTree(object):

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

    def children(self):
        return (self.left, self.right)

    def nodes(self):
        return (self.left, self.right)

    def __str__(self):
        return '%s_%s' % (self.left, self.right)


# Main function implementing huffman coding
def huffman_code_tree(node, left=True, binString=''):
    if type(node) is str:
        return {node: binString}
    (l, r) = node.children()
    d = dict()
    d.update(huffman_code_tree(l, True, binString + '0'))
    d.update(huffman_code_tree(r, False, binString + '1'))
    return d


# Calculating frequency
freq = {}
for c in string:
    if c in freq:
        freq[c] += 1
    else:
        freq[c] = 1

freq = sorted(freq.items(), key=lambda x: x[1], reverse=True)

nodes = freq

# Build the tree: repeatedly merge the two nodes with the lowest frequencies
while len(nodes) > 1:
    (key1, c1) = nodes[-1]
    (key2, c2) = nodes[-2]
    nodes = nodes[:-2]
    node = NodeTree(key1, key2)
    nodes.append((node, c1 + c2))

    # keep the list sorted so the two smallest frequencies stay at the end
    nodes = sorted(nodes, key=lambda x: x[1], reverse=True)

huffmanCode = huffman_code_tree(nodes[0][0])

print(' Char | Huffman code ')
print('----------------------')
for (char, frequency) in freq:
    print(' %-4r |%12s' % (char, huffmanCode[char]))
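The script above only builds the codes. As a rough sketch of the decoding step described earlier (assuming the tree built by the script, whose root is nodes[0][0] and whose leaves are plain characters), decoding walks the tree bit by bit:

# Decoding sketch: follow 0 to the left child and 1 to the right child,
# emitting a character each time a leaf (a plain string) is reached
def huffman_decode(bits, root):
    decoded = []
    node = root
    for bit in bits:
        node = node.children()[0] if bit == '0' else node.children()[1]
        if type(node) is str:       # reached a leaf character
            decoded.append(node)
            node = root             # restart from the root for the next code
    return ''.join(decoded)

# For example, encoding the original string and decoding it back:
encoded = ''.join(huffmanCode[c] for c in string)
print(huffman_decode(encoded, nodes[0][0]))   # BCAADDDCCACACAC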

Huffman Coding Complexity

The time complexity for encoding each unique character based on its frequency is O(n log n).

Extracting the minimum frequency from the priority queue takes place 2*(n-1) times, and each extraction costs O(log n). Thus the overall complexity is O(n log n).
Huffman Coding Applications

- Huffman coding is used in conventional compression formats like GZIP, BZIP2, PKZIP, etc.

- For text and fax transmissions.
