
ALGORITHM ANALYSIS
Prepared By:
Belete G. &
Habitamu A.
Course Contents
1. Divide and Conquer
2. Merge sort implementation techniques
3. Greedy algorithm
4. Dynamic programming
5. Backtracking algorithm
6. Branch and Bound
7. Prim's algorithm
8. Huffman coding
9. String matching
10. Knapsack algorithm
11. Kruskal's algorithm
Introduction
● In algorithm design there is no single 'silver bullet' that cures all
computational problems. Different problems require the use of different
kinds of techniques. A good programmer uses all these techniques,
choosing among them based on the type of problem. Some commonly used techniques are:
○ Divide and conquer
○ Randomized algorithms
○ Greedy algorithms (This is not an algorithm, it is a technique.)
○ Dynamic programming
1. Divide and Conquer
● The best-known algorithm design strategy
● Given a function to compute on 'n' inputs, the divide-and-conquer
strategy suggests splitting the inputs into 'k' distinct subsets, 1 < k <= n,
yielding 'k' subproblems.
● These subproblems must be solved, and then a method must be
found to combine the subsolutions into a solution of the whole.
● If the subproblems are still relatively large, then the divide-and-
conquer strategy can possibly be reapplied.
● Often the subproblems resulting from a divide-and-conquer design
are of the same type as the original problem.
Divide and Conquer Cont’d
● Divide-and-conquer breaks a problem into subproblems that are similar to the
original problem, recursively solves the subproblems, and finally combines the
solutions to the subproblems to solve the original problem.
● Because divide-and-conquer solves subproblems recursively, each subproblem
must be smaller than the original problem, and there must be a base case for
subproblems.
● You should think of a divide-and-conquer algorithm as having three parts:
○ Divide the problem into a number of subproblems that are smaller
instances of the same problem.
○ Conquer the subproblems by solving them recursively. If they are small
enough, solve the subproblems as base cases.
○ Combine the solutions to the subproblems into the solution for the original
problem.
Divide and Conquer Cont’d
● Because divide-and-conquer creates at least two subproblems, a
divide-and-conquer algorithm makes multiple recursive calls.
General Algorithm
● A divide and conquer algorithm is a strategy of solving a large problem by
1. breaking the problem into smaller sub-problems,
2. solving the sub-problems, and
3. combining them to get the desired output.
● To use the divide and conquer algorithm, recursion is used, as in the sketch below.
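As a minimal illustration of these three steps (a sketch, not from the original slides; the function name maxElement is ours), the following C++ code finds the maximum of an array by dividing it in half, conquering each half recursively, and combining the two partial answers:

    #include <algorithm>

    // Divide and conquer: maximum of Arr[start..end].
    int maxElement(int *Arr, int start, int end) {
        if (start == end)                                // base case: one element
            return Arr[start];
        int mid = (start + end) / 2;                     // divide
        int leftMax = maxElement(Arr, start, mid);       // conquer left half
        int rightMax = maxElement(Arr, mid + 1, end);    // conquer right half
        return std::max(leftMax, rightMax);              // combine
    }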
Divide and Conquer Cont’d
The following are some standard algorithms that follow the Divide and Conquer approach.
1. Closest Pair of Points: the problem is to find the closest pair of points in a set of
points in the x-y plane. The problem can be solved in O(n^2) time by calculating the
distances of every pair of points and comparing the distances to find the minimum.
The Divide and Conquer algorithm solves the problem in O(n log n) time.
2. Strassen’s Algorithm is an efficient algorithm to multiply two matrices. A simple
method to multiply two matrices needs 3 nested loops and takes O(n^3) time. Strassen’s
algorithm multiplies two matrices in O(n^2.8074) time.
3. The Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common
algorithm for the FFT. It is a divide and conquer algorithm which works in O(n log n) time.
4. The Karatsuba algorithm for fast multiplication multiplies two n-digit
numbers in at most O(n^1.585) time (roughly n^(log2 3) single-digit multiplications), as sketched below.
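A rough illustration of Karatsuba's idea (a sketch on machine integers for readability, so it overflows for large inputs; a real implementation works on arbitrary-precision digit arrays):

    #include <algorithm>

    // Karatsuba: with x = a*10^h + b and y = c*10^h + d,
    // x*y = ac*10^(2h) + (ad + bc)*10^h + bd,
    // and ad + bc = (a + b)(c + d) - ac - bd, so only three
    // recursive multiplications are needed instead of four.
    long long karatsuba(long long x, long long y) {
        if (x < 10 || y < 10) return x * y;          // base case: single digit
        int n = 0;                                   // digits of the larger operand
        for (long long t = std::max(x, y); t > 0; t /= 10) n++;
        long long p = 1;
        for (int i = 0; i < n / 2; i++) p *= 10;     // p = 10^(n/2)
        long long a = x / p, b = x % p;
        long long c = y / p, d = y % p;
        long long ac = karatsuba(a, c);
        long long bd = karatsuba(b, d);
        long long mid = karatsuba(a + b, c + d) - ac - bd;   // = ad + bc
        return ac * p * p + mid * p + bd;
    }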
Merge Sort
DAC and Merge Sort
● The general divide-and-conquer recurrence is
T(n) = aT(n/b) + f(n)
where,
n = size of the input
a = number of subproblems in the recursion
n/b = size of each subproblem (all subproblems are assumed to have the same size)
f(n) = cost of the work done outside the recursive calls, which includes the
cost of dividing the problem and the cost of merging the solutions
● For merge sort, the recurrence becomes
T(n) = 2T(n/2) + O(n)
where,
a = 2 (each time, a problem is divided into 2 subproblems)
n/b = n/2 (the size of each subproblem is half of the input)
f(n) = O(n), the time taken to divide the problem and merge the subproblem solutions
● Solving this recurrence gives T(n) = O(n log n). (To understand this,
please refer to the master theorem.)
Pros and Cons of DAC
● Advantages
○ Best performance for difficult problems such as the Tower of Hanoi
○ Makes good use of memory caches when the subproblems become small enough
○ Easy to parallelize
● Disadvantages
○ In some cases the overhead of recursion outweighs the benefits
○ More complicated than a basic iterative approach
○ Similar subproblems can occur many times
Merge Sort Implementation C/C++
    void merge(int *Arr, int start, int mid, int end) {
        // temp holds the merged result of Arr[start..mid] and Arr[mid+1..end]
        int temp[end - start + 1];
        int i = start, j = mid + 1, k = 0;
        // Merge the two sorted halves in order
        while (i <= mid && j <= end) {
            if (Arr[i] <= Arr[j]) {
                temp[k] = Arr[i];
                k += 1; i += 1;
            } else {
                temp[k] = Arr[j];
                k += 1; j += 1;
            }
        }
        // Copy any remaining elements of the left half
        while (i <= mid) {
            temp[k] = Arr[i];
            k += 1; i += 1;
        }
        // Copy any remaining elements of the right half
        while (j <= end) {
            temp[k] = Arr[j];
            k += 1; j += 1;
        }
        // Copy the merged result back into Arr
        for (i = start; i <= end; i += 1) {
            Arr[i] = temp[i - start];
        }
    }

    void mergeSort(int *Arr, int start, int end) {
        if (start < end) {
            int mid = (start + end) / 2;
            mergeSort(Arr, start, mid);
            mergeSort(Arr, mid + 1, end);
            merge(Arr, start, mid, end);
        }
    }
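A short driver (not part of the original slides) showing how mergeSort would be called:

    #include <cstdio>

    // Assumes the merge and mergeSort definitions above are in scope.
    int main() {
        int Arr[] = {38, 27, 43, 3, 9, 82, 10};
        int n = sizeof(Arr) / sizeof(Arr[0]);
        mergeSort(Arr, 0, n - 1);       // sort the whole array in place
        for (int i = 0; i < n; i++)
            printf("%d ", Arr[i]);      // prints: 3 9 10 27 38 43 82
        printf("\n");
        return 0;
    }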
Conclusion
● The analysis of the problem is done using recursion and recurrence relations
● The following computer algorithms are based on the divide-and-
conquer programming approach
○ Merge Sort
○ Quick Sort
○ Binary Search
○ Strassen's Matrix Multiplication
○ Closest pair (points)
Greedy Algorithm
● A greedy algorithm, as the name suggests, always makes the choice that seems to be the
best at that moment. This means that it makes a locally-optimal choice in the hope that
this choice will lead to a globally-optimal solution.
● A greedy algorithm is a simple, intuitive algorithm that is used in optimization problems.
● The algorithm makes the optimal choice at each step as it attempts to find the overall
optimal way to solve the entire problem.
● Greedy algorithms are quite successful in some problems, such as Huffman encoding
which is used to compress data, or Dijkstra's algorithm, which is used to find the shortest
path through a graph.
● An optimization problem is one in which you want to find, not just a solution, but the best
solution.
● A “greedy algorithm” sometimes works well for optimization problems
● A greedy algorithm works in phases. At each phase:
○ You take the best you can get right now, without regard for future consequences
○ You hope that by choosing a local optimum at each step, you will end up at a global
optimum
Greedy Algorithm Example: Counting Money
● Counting out 179 birr and 50 cents, taking the largest denomination that fits at each step:
a. Count one 100-birr note
b. Count one 50-birr note
c. Count one 20-birr note
d. Count nine 1-birr coins
e. Count one 50-cent coin
i. Each step is a locally optimal choice; see the sketch below.
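A minimal sketch of this greedy counting in C++ (the set of available denominations, expressed in cents, is an assumption for this example):

    #include <cstdio>

    int main() {
        // Denominations in cents, largest first (assumed set for this example):
        // 100-birr, 50-birr, 20-birr, 1-birr, and 50-cent pieces.
        int denom[] = {10000, 5000, 2000, 100, 50};
        int n = sizeof(denom) / sizeof(denom[0]);
        int amount = 17950;                  // 179 birr and 50 cents, in cents

        for (int i = 0; i < n; i++) {
            int count = amount / denom[i];   // greedy: take as many as fit
            amount %= denom[i];
            if (count > 0)
                printf("%d x %d cents\n", count, denom[i]);
        }
        return 0;
    }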
Greedy Algorithm Example: Counting Money
● In some (fictional) monetary system, “krons” come in 1-kron, 7-kron, and 10-kron coins
● Using a greedy algorithm to count out 15 krons, you would get
a. A 10-kron piece
b. Five 1-kron pieces, for a total of 15 krons
c. This requires six coins
● A better solution would be to use two 7-kron pieces and one 1-kron piece
a. This only requires three coins
● The greedy algorithm results in a solution, but not in an optimal solution
Greedy Algorithm: Scheduling Problem
● You have to run nine jobs, with running times of 3, 5, 6, 10, 11, 14, 15, 18, and 20 minutes
● You have three processors on which you can run these jobs
● Approach 1
○ You decide to do the longest-running jobs first, on whichever processor is available
○ Time to completion: 18 + 11 + 6 = 35 minutes
○ This solution isn’t bad, but we might be able to do better
● Approach 2
○ What would be the result if you ran the shortest job first?
○ Again, the running times are 3, 5, 6, 10, 11, 14, 15, 18, and 20 minutes
○ That wasn’t such a good idea; the time to completion is now 6 + 14 + 20 = 40 minutes
○ Note, however, that the greedy algorithm itself is fast
○ All we had to do at each stage was pick the minimum
[Chart: the jobs assigned to Process One, Process Two, and Process Three]
● A sketch of the longest-job-first assignment follows below.
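A minimal sketch of the longest-job-first (greedy) assignment, keeping a min-heap of each processor's current load:

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <vector>

    int main() {
        std::vector<int> jobs = {3, 5, 6, 10, 11, 14, 15, 18, 20};
        std::sort(jobs.rbegin(), jobs.rend());            // longest job first

        // Min-heap of current loads, one entry per processor.
        std::priority_queue<int, std::vector<int>, std::greater<int>> load;
        for (int p = 0; p < 3; p++) load.push(0);

        for (int j : jobs) {
            int t = load.top();                           // least-loaded processor
            load.pop();
            load.push(t + j);                             // run this job on it
        }

        int makespan = 0;
        while (!load.empty()) { makespan = load.top(); load.pop(); }
        printf("time to completion: %d minutes\n", makespan);   // 35
        return 0;
    }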
Greedy Algorithm: Scheduling Problem
● Optimal Solution
○ This solution is clearly optimal (why?)
○ Clearly, there are other optimal solutions (why?)
○ How do we find such a solution?
○ One way: try all possible assignments of jobs to processors
○ Unfortunately, this approach can take exponential time
[Chart: an optimal assignment of the jobs to Process One, Process Two, and Process Three]
Application of Greedy
● Dijkstra’s algorithm
● Minimal spanning tree in a graph, using Prim’s and Kruskal’s algorithms
● Analysis of electrical circuits
● Travelling salesman problem (TSP)
● Knapsack problem
Dynamic Programming
● Dynamic Programming is mainly an optimization over plain recursion.
● Wherever we see a recursive solution that has repeated calls for the same
inputs, we can optimize it using Dynamic Programming.
● The idea is to simply store the results of subproblems, so that we do not
have to re-compute them when needed later. This simple optimization
reduces time complexities from exponential to polynomial.
● For example, if we write a simple recursive solution for Fibonacci Numbers,
we get exponential time complexity, and if we optimize it by storing the
solutions of subproblems, the time complexity reduces to linear, as the
sketch below shows.
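A minimal sketch of that optimization (the array name memo is ours):

    #include <cstdio>
    #include <cstring>

    long long memo[64];                      // memo[i] = F(i), or -1 if not computed yet

    // Top-down Fibonacci: every F(i) is computed once, so this runs in O(n);
    // the plain recursion without memo would take exponential time.
    long long fib(int n) {
        if (n <= 1) return n;                // base cases: F(0) = 0, F(1) = 1
        if (memo[n] != -1) return memo[n];   // reuse a stored subproblem
        return memo[n] = fib(n - 1) + fib(n - 2);
    }

    int main() {
        memset(memo, -1, sizeof(memo));      // mark every entry as "not computed"
        printf("F(50) = %lld\n", fib(50));
        return 0;
    }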
Dynamic Programming
● The majority of Dynamic Programming problems can be categorized into
two types:
○ Optimization problems.
○ Combinatorial problems.
● The optimization problems expect you to select a feasible solution, so
that the value of the required function is minimized or maximized.
● Combinatorial problems expect you to figure out the number of ways to
do something, or the probability of some event happening.
Dynamic Programming
● Dynamic programming is a way of improving on an inefficient divide-and-
conquer algorithm. By “inefficient”, we mean that the same recursive call is
made over and over
● Dynamic programming is applicable when the subproblems are dependent,
that is, when subproblems share subsolutions
● Main idea:
○ set up a recurrence relating a solution of a larger instance to solutions of
some smaller instances
○ solve each smaller instance once
○ record the solutions
○ extract the solution of the original instance from the recorded subsolutions
Dynamic Programming
● DP is used to solve problems with the following characteristics:
○ Simple subproblems
■ We should be able to break the original problem into smaller
subproblems that have the same structure
○ Optimal substructure of the problems
■ The optimal solution to the problem contains within it optimal
solutions to its subproblems.
○ Overlapping subproblems
■ There exist some places where we solve the same subproblem
more than once.
Approaches of DP
● Tabulation: Bottom-Up approach
○ It starts by solving the lowest-level subproblems.
○ It then solves the next subproblems, which are built from the lowest-level ones.
○ By completing all the subproblems in order, from the lowest level upward, it arrives at the solution of the original problem.
● Memoization: Top-Down approach
○ Memoization refers to the technique of caching and reusing previously
computed results.
○ It starts with the highest-level subproblems (the ones closest to the
original problem), and recursively calls the next subproblem, and the
next.
Example Dynamic Programming
    // Tabulated (bottom-up) version to find factorial n.
    int dp[MAXN];

    // base case
    dp[0] = 1;

    for (int i = 1; i <= n; i++)
        dp[i] = dp[i-1] * i;
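For comparison, a top-down (memoized) version of the same computation, a minimal sketch assuming 0 <= n < MAXN (the names memo and fact are ours):

    // Memoized (top-down) version to find factorial n.
    long long memo[MAXN];                   // 0 marks "not yet computed"

    long long fact(int n) {
        if (n == 0) return 1;               // base case
        if (memo[n] != 0) return memo[n];   // reuse a cached result
        return memo[n] = n * fact(n - 1);   // compute once, then store
    }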
Backtracking Algorithms
● Backtracking is an algorithmic technique for solving problems recursively by trying
to build a solution incrementally, one piece at a time, removing those solutions that
fail to satisfy the constraints of the problem at any point of time (by time, here, we
mean the time elapsed until reaching any level of the search tree).
● For example, consider the Sudoku solving problem: we try
filling in digits one by one. Whenever we find that the current
digit cannot lead to a solution, we remove it (backtrack)
and try the next digit.
● This is better than the naive approach (generating all possible
combinations of digits and then trying every combination
one by one), as it drops a set of permutations whenever it
backtracks.
Backtracking Algorithms
● Backtracking is an algorithmic technique where the goal is to get all solutions to a problem
using the brute force approach.
○ It consists of building a set of all the solutions incrementally. Since a problem would have
constraints, the solutions that fail to satisfy them will be removed.
● It uses recursive calling to find a solution set by building a solution step by step, increasing
levels with time. In order to find these solutions, a search tree named state-space tree is used.
In a state-space tree, each branch is a variable, and each level represents a solution.
● A backtracking algorithm uses the depth-first search method. When it starts exploring the
solutions, a bounding function is applied so that the algorithm can check if the so-far built
solution satisfies the constraints.
○ If it does, it continues searching. If it doesn’t, the branch would be eliminated, and the
algorithm goes back to the level before.
Backtracking Algorithms
● The backtracking algorithm is applied to some
specific types of problems. For instance, we can use it
to find a feasible solution to a decision problem. It
was also found to be very effective for
optimization problems.
● For some cases, a backtracking algorithm is used for
the enumeration problem in order to find the set of
all feasible solutions for the problem.
● On the other hand, backtracking is not considered an
optimized technique to solve a problem. It finds its
application when the solution needed for a problem is
not time-bounded.
● It is true that for the N-Queens problem the solutions found are
valid, but still, a backtracking algorithm for the N-Queens
problem presents a time complexity on the order of O(N!), as
the sketch below illustrates.
● Backtracking remains a valid and vital tool for solving
various kinds of problems, even though this
algorithm’s time complexity may be high, as it may
need to explore all existing solutions.
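A compact sketch of backtracking on the N-Queens problem: place one queen per row, prune any placement that conflicts with the queens already placed, and back up when no column works:

    #include <cstdio>
    #include <cstdlib>

    const int N = 8;
    int col[N];                  // col[r] = column of the queen placed in row r

    // Can a queen go at (row, c), given rows 0..row-1 are already placed?
    bool safe(int row, int c) {
        for (int r = 0; r < row; r++)
            if (col[r] == c || abs(col[r] - c) == row - r)   // column or diagonal clash
                return false;
        return true;
    }

    // Try every column in the current row; count complete placements.
    long long solve(int row) {
        if (row == N) return 1;           // all N queens placed: one full solution
        long long count = 0;
        for (int c = 0; c < N; c++) {
            if (safe(row, c)) {           // bounding function: prune bad branches
                col[row] = c;
                count += solve(row + 1);  // recurse; returning here is the backtrack
            }
        }
        return count;
    }

    int main() {
        printf("%lld solutions for %d-Queens\n", solve(0), N);   // 92 for N = 8
        return 0;
    }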
Branch and Bound
● Branch and bound is an algorithm design paradigm which is generally
used for solving combinatorial optimization problems.
● These problems are typically exponential in terms of time complexity
and may require exploring all possible permutations in the worst case.
● The Branch and Bound technique solves these problems
relatively quickly.
○ It typically explores the state-space tree in breadth-first order.
● Branch and bound is more suitable for situations where we cannot
apply the greedy method or dynamic programming.
Branch and Bound
● Usually, this algorithm is slow, as it requires exponential time
complexity in the worst case, but sometimes it works with
reasonable efficiency.
● However, this method helps to determine the global optimum in
non-convex problems.
● Branch and Bound is an algorithm for finding optimal solutions to many
optimization problems, especially in discrete and combinatorial
optimization.
○ Example: job sequencing, solved in a reasonable time
Knapsack Problem
● The Knapsack Problem is a famous Dynamic Programming Problem that falls in the
optimization category.
● It derives its name from a scenario where, given a set of items with specific weights
and assigned values, the goal is to maximize the value in a knapsack while remaining
within the weight constraint.
● Given weights and values of n items, put these items in a knapsack of capacity W to
get the maximum total value in the knapsack.
● In other words, given two integer arrays val[0..n-1] and wt[0..n-1],
which represent the values and weights associated with n items
respectively, and an integer W which represents the knapsack
capacity, find the maximum-value subset of val[] such that
the sum of the weights of this subset is smaller than or equal to W.
● You cannot break an item: either pick the
complete item or don’t pick it (the 0-1 property). A sketch follows below.
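A minimal sketch of the standard dynamic-programming solution (the item weights and values here are assumed example data):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    int main() {
        int val[] = {60, 100, 120};      // example values (assumed)
        int wt[]  = {10, 20, 30};        // example weights (assumed)
        int n = 3, W = 50;               // knapsack capacity

        // dp[w] = best value achievable with capacity w using the items seen so far
        std::vector<int> dp(W + 1, 0);
        for (int i = 0; i < n; i++)
            for (int w = W; w >= wt[i]; w--)   // go downward so each item is used at most once
                dp[w] = std::max(dp[w], dp[w - wt[i]] + val[i]);

        printf("maximum value = %d\n", dp[W]);   // 220 for this data
        return 0;
    }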
Prim’s Algorithms
● Prim’s Algorithm is a famous greedy algorithm.
● It is used for finding the Minimum Spanning Tree (MST) of a given graph.
● To apply Prim’s algorithm, the given graph must be weighted, connected and undirected.
● Steps
○ Step-01:
■ Randomly choose any vertex.
■ The vertex connected to the edge of least weight is usually selected.
○ Step-02:
■ Find all the edges that connect the tree to new vertices.
■ Find the least weight edge among those edges and include it in the existing tree.
■ If including that edge creates a cycle, then reject that edge and look for the next least weight edge.
○ Step-03:
■ Keep repeating step-02 until all the vertices are included and Minimum Spanning Tree (MST) is obtained.
Prim's Algorithm Time Complexity
● If an adjacency list is used to represent the graph, then all the vertices
can be traversed in O(V + E) time using breadth-first search.
● We traverse all the vertices of the graph using breadth-first search and use a min-heap
for storing the vertices not yet included in the MST.
● To get the minimum-weight edge, we use the min-heap as a priority queue.
● Min-heap operations such as extracting the minimum element and decreasing a key value take
O(log V) time. So, the overall time complexity
= O(E + V) x O(log V)
= O((E + V) log V)
= O(E log V)
● This time complexity can be improved and reduced to O(E + V log V) using a Fibonacci heap.
A sketch of the min-heap version follows below.
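A minimal sketch of Prim's algorithm with an adjacency list and a std::priority_queue as the min-heap, matching the O(E log V) bound above (the graph data is an assumed example):

    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    int main() {
        int V = 5;
        // adj[u] holds (weight, v) pairs; an assumed example graph
        std::vector<std::vector<std::pair<int,int>>> adj(V);
        auto addEdge = [&](int u, int v, int w) {
            adj[u].push_back({w, v});
            adj[v].push_back({w, u});
        };
        addEdge(0, 1, 2); addEdge(0, 3, 6); addEdge(1, 2, 3);
        addEdge(1, 3, 8); addEdge(1, 4, 5); addEdge(2, 4, 7); addEdge(3, 4, 9);

        std::vector<bool> inMST(V, false);
        // min-heap of (edge weight, vertex) candidates not yet in the tree
        std::priority_queue<std::pair<int,int>,
                            std::vector<std::pair<int,int>>,
                            std::greater<std::pair<int,int>>> pq;
        pq.push({0, 0});                      // start from vertex 0 with cost 0
        int total = 0;

        while (!pq.empty()) {
            auto [w, u] = pq.top(); pq.pop(); // least-weight candidate edge
            if (inMST[u]) continue;           // skip stale heap entries
            inMST[u] = true;                  // include the new vertex in the MST
            total += w;
            for (auto [wt, v] : adj[u])       // offer edges to vertices still outside
                if (!inMST[v]) pq.push({wt, v});
        }
        printf("MST total weight = %d\n", total);   // 16 for this example graph
        return 0;
    }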
Example
● Construct the minimum spanning tree (MST) for the given graph using Prim’s
Algorithm
Huffman Coding
● Huffman Coding is a famous Greedy Algorithm.
● It is used for the lossless compression of data.
● It uses variable length encoding.
● It assigns variable length code to all the characters.
● The code length of a character depends on how frequently it occurs in
the given text.
● The character which occurs most frequently gets the smallest code.
● The character which occurs least frequently gets the largest code.
Steps of Huffman Coding
● There are two major steps in Huffman Coding-
○ Building a Huffman Tree from the input characters.
○ Assigning code to the characters by traversing the Huffman Tree.
● Step-01:
○ Create a leaf node for each character of the text.
○ Leaf node of a character contains the occurring frequency of that character.
● Step-02:
○ Arrange all the nodes in increasing order of their frequency value.
● Step-03:
○ Considering the first two nodes having minimum frequency,
■ Create a new internal node.
■ The frequency of this new node is the sum of frequency of those two nodes.
■ Make the first node as a left child and the other node as a right child of the
newly created node.
● Step-04:
○ Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
○ The tree finally obtained is the desired Huffman Tree.
Huffman Tree Construction
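A minimal sketch of the construction described above, using a min-heap of nodes ordered by frequency (the characters and frequencies are assumed example data; the sketch leaks the nodes for brevity):

    #include <cstdio>
    #include <queue>
    #include <string>
    #include <vector>

    struct Node {
        char ch;                 // '\0' for internal nodes
        int freq;
        Node *left, *right;
    };

    struct Cmp {                 // order the heap by increasing frequency
        bool operator()(Node* a, Node* b) const { return a->freq > b->freq; }
    };

    // Walk the finished tree: left edge = 0, right edge = 1.
    void printCodes(Node* n, std::string code) {
        if (!n->left && !n->right) { printf("%c: %s\n", n->ch, code.c_str()); return; }
        printCodes(n->left, code + "0");
        printCodes(n->right, code + "1");
    }

    int main() {
        char chars[] = {'a', 'b', 'c', 'd'};
        int freqs[]  = {45, 13, 12, 30};     // assumed example frequencies

        // Step-01: one leaf node per character, kept in a min-heap by frequency.
        std::priority_queue<Node*, std::vector<Node*>, Cmp> pq;
        for (int i = 0; i < 4; i++)
            pq.push(new Node{chars[i], freqs[i], nullptr, nullptr});

        // Steps 02-04: repeatedly merge the two least-frequent nodes.
        while (pq.size() > 1) {
            Node* l = pq.top(); pq.pop();
            Node* r = pq.top(); pq.pop();
            pq.push(new Node{'\0', l->freq + r->freq, l, r});
        }
        printCodes(pq.top(), "");            // the remaining node is the Huffman tree
        return 0;
    }

The most frequent character ('a' here) receives the shortest code, as the slides describe.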
