the second is used if the direct solution is applied for a larger # of cases
the closed form of a recurrence relation may not be simple or neat; however, it eliminates
the recursive call so that we can quickly compare equations and determine their order
d. Closest pair problem consider a set of N points in space given as p1, p2,…,pn; the closest pair problem looks for the 2 of these points
that are closest in space; the brute force way is calculating the distance from every point to every other point and finding the smallest
distance; the brute force solution is O(n²): compute the distance between each pair and return the smallest; we can calculate the
smallest distance in O(n log n) time using the divide and conquer strategy
e. Divide and Conquer Solution to closest pair problem Step 1 (create 2 lists of the N points; the first will be sorted by increasing
values of the x coordinate; the second will be sorted by increasing values of the y coordinate; the base case for the recursive algorithm
will be when there are 3 or fewer points; if there are more than 3 points, divide the set of points along a vertical line, so that half of the
points are in the left part and half are in the right part) Step 2 (recursively call the algorithm for the left and right parts until each
part has no more than 3 points; the shortest distance in each part can be determined easily; let d be the smaller of the 2 results; it is
possible that the distance between a point in the left part and a point in the right part is shorter than d) Step 3 (create a new set of
points that have x coordinates in the middle strip, within d of the dividing line; find if there is a pair with a shorter distance than d; if
there is a pair of points closer than d in any section, those points must be within 8 positions of each other in the middle strip)
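Before the divide and conquer version, the brute-force baseline described above can be sketched as follows (a minimal illustration; the function name and the use of Python's math.dist are my own choices, not from the notes):

```python
import math

def closest_pair_brute_force(points):
    """Return the smallest distance between any 2 of the given (x, y) points.

    O(n^2): computes the distance between every pair and keeps the minimum."""
    best = math.inf
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            best = min(best, math.dist(points[i], points[j]))
    return best
```

The divide and conquer version improves this to O(n log n) by the split/strip process described in the steps above.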
f. Convex hull—divide and conquer solution sorting the points in order based on the x coordinate; if a group of points has the same x
coordinate, they are ordered by y coordinate; the 1st and the last points must be extreme points; use these 2 points to form a line and
divide the other points into 2 sets based on whether they are on the right or the left side; the convex hull will be the line from p1 to pn
along with the upper/lower hull of the entire set of points; upper and lower hulls can be found by a recursive process (input is the line
dividing the larger set of points and the subset of points on one side of the line; find the point that is farthest from the line; connect
the farthest point with the 2 end points of the line; check whether the other points are all inside the triangle; if not, the algorithm is
called recursively with the 1 or 2 sets of points that are outside of the triangle)
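The recursive upper/lower-hull process described above is essentially the quickhull algorithm; a sketch (function names are illustrative, and this version assumes at least 2 distinct points):

```python
def quickhull(points):
    """Divide-and-conquer convex hull (quickhull); returns hull points in order."""
    points = sorted(points)                # p1 and pn are extreme points
    a, b = points[0], points[-1]

    def cross(p, q, r):
        # > 0 when r lies strictly to the left of the line p -> q
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    def hull(p, q, pts):
        # hull points strictly left of line p -> q, listed in order between p and q
        if not pts:
            return []
        far = max(pts, key=lambda r: cross(p, q, r))   # farthest from the line
        return (hull(p, far, [r for r in pts if cross(p, far, r) > 0])
                + [far]
                + hull(far, q, [r for r in pts if cross(far, q, r) > 0]))

    upper = hull(a, b, [r for r in points if cross(a, b, r) > 0])
    lower = hull(b, a, [r for r in points if cross(b, a, r) > 0])
    return [a] + upper + [b] + lower
```

Points inside the triangle formed at each step are discarded, which is what makes the recursion shrink quickly in practice.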
3. Chapter 3 Searching and Selection Algorithms
a. Searching a list that contains records of information, stored as an array; list locations will be indexed from 1 to N; records may be
sorted or unsorted based on their key value; if a list is unsorted, we must search sequentially—look through the list for the item we
want (not very efficient); if the list of elements is sorted, we have more options—binary search
b. Sequential search assume the list is unsorted; assume the key values are unique; the task for a search algorithm is to identify the
location of the target, so it returns the index in the range 1 to N of where the record is located; return 0 if the target is not in the list;
the further down the list a particular key value is, the longer it will take to find that key value
i. Worst-case analysis: case 1 (target is the last element in the list) case 2 (target is not in the list); N comparisons are needed; N
comparisons is the upper bound for any searching algorithm on a list of N elements
ii. Average-case analysis: assuming the target is in the list, the average # of comparisons is (N + 1)/2; if the target may not be in the
list, the average # of comparisons is (N + 2)/2; including the possibility of the target not being in the list only increases the average
case by 1/2; when we consider this amount relative to the size of the list, 1/2 is not significant
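The sequential search described above can be sketched as follows (a minimal version using the 1-to-N indexing convention of the notes; the function name is my own):

```python
def sequential_search(lst, target):
    """Return the 1-based index of target in lst, or 0 if it is not present."""
    for i, key in enumerate(lst, start=1):
        if key == target:
            return i
    return 0
```

The worst case (target last or absent) clearly does N comparisons, matching the analysis above.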
c. Binary search assume the elements of the list are sorted; compare the target with the element in the middle; either the target
matches, the target is less than the element, or the target is greater than the element; if no match, half of the list can be eliminated
i. Worst-case analysis: k = ⌈lg(N + 1)⌉ comparisons in the worst case
ii. Decision tree nodes of the decision tree would have the element that is checked at each pass; those elements that would
be checked if the target is less than the current element would go into the left subtree; those would be checked when the
target is greater would go into the right subtree
iii. Average-case analysis case 1 (the target will always be in the list; N possible locations for the target; assume each is
equally likely, so probability = 1/N; 1 comparison is done to find the element that is in the root of the tree on level 1; 2
comparisons are done to find the elements that are in the nodes on level 2; i comparisons are done to find the elements that
are in the nodes on level i; for a binary tree there are 2^(i−1) nodes on level i; when N = 2^k − 1, there are k levels in the tree;
average # of comparisons: about lg(N + 1) − 1) case 2 (the target may not be in the list; there are N+1 of these possibilities; the
target can be smaller than the element in location 1, larger than the element in location 1 but smaller than the one in location
2, larger than the element in location 2 but smaller than the one in location 3, and so on, up to the target being larger than the
element in location N; in each case, it takes k comparisons to learn that the target is not in the list; there are 2N+1 possibilities
to include in our calculation; average # of comparisons: about lg(N + 1) − 1/2)
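The binary search described above can be sketched as follows (a minimal iterative version; it returns 0 for "not found" to match the convention used for sequential search, which is my own choice of interface):

```python
def binary_search(lst, target):
    """Return the 1-based index of target in sorted lst, or 0 if it is not present."""
    low, high = 1, len(lst)
    while low <= high:
        mid = (low + high) // 2
        if lst[mid - 1] == target:       # list stored 0-based in Python
            return mid
        elif target < lst[mid - 1]:
            high = mid - 1               # eliminate the upper half
        else:
            low = mid + 1                # eliminate the lower half
    return 0
```

For a list of 15 elements the loop runs at most 4 times, matching the quiz answers above.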
4. Chapter 4 Sorting Algorithms
a. Insertion sort the basic idea: you have a list that is sorted and add a new element into it, so the list is one element larger; next
insert another element into the larger group, and continue until all elements are in the group; the 1st element of any list is always a
sorted list of size 1; insert the 2nd element into it; this is repeated until all of the elements have been put into the expanding sorted
portion of the list
i. Worst-case analysis for one pass, the worst case is when the new element to be added is smaller than all of the elements already
in the sorted part of the list; the most work the entire algorithm will do is in the case where every new element is added to
the front of the list; the worst case is that the list must be in decreasing order when we start; W(N) = O(N²)
ii. Average-case analysis what is the average # of comparisons needed to move one element into place? Adding the ith
element to the sorted part of the list does at most i comparisons; for the ith element, it will do 1, 2, 3,…,i comparisons for
locations i + 1, i, i−1,…, 2, and it will do i comparisons for location 1; the average # of comparisons to insert the ith element
needs to be summed up for each of the 1 through N−1 elements that gets "added" to the list; A(N) = O(N²)
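The insertion process described above can be sketched as follows (an in-place version; shifting rather than repeated swapping is a common implementation choice, not something the notes specify):

```python
def insertion_sort(lst):
    """Sort lst in place by growing a sorted prefix one element at a time."""
    for i in range(1, len(lst)):
        new_element = lst[i]
        j = i - 1
        # shift larger elements right until the insertion point is found
        while j >= 0 and lst[j] > new_element:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = new_element
    return lst
```

On [14, 12, 5, 8, …] the first pass produces [12, 14, 5, 8, …], matching Quiz 2 question 10.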
b. Bubble sort start each of the passes at the beginning of the list and compare the elements in locations 1 and 2, then the elements in
locations 2 and 3, then 3 and 4, and so on, swapping those that are out of order; once the algorithm reaches the largest element, it will
be swapped with all of the remaining elements, moving it to the end of the list after the first pass; the second pass will move the
second largest element down until it is in the second to last location
i. Best-case analysis on the first pass, the for loop must fully execute, and so this algorithm does at least N−1 comparisons; if
there are no swaps, swappedElements will still be false and the algorithm will end; so the best case is N−1 comparisons; this
happens when the data values are already in order
ii. Worst-case analysis we might want to see if having the input data in reverse order will lead us to the worst case; on the
first pass, the largest value is first, and it will be swapped with every other element down the list; on the second pass, the second
largest element is now in the first position, and it will be swapped with every other element in the list until it is in the second
to last position; W(N) = O(N²)
iii. Average-case analysis assume that it is equally likely that on any of these passes there will be no swaps done; we need to
know how many comparisons are done in each of these possibilities; if we stop after 1 pass, we have done N−1 comparisons; if
we stop after 2 passes, we have done N−1 + N−2 comparisons; A(N) = O(N²)
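The pass structure and the swappedElements early-exit flag described above can be sketched as:

```python
def bubble_sort(lst):
    """Bubble sort with the early-exit flag: stop when a pass makes no swaps."""
    n = len(lst)
    for last in range(n - 1, 0, -1):
        swapped_elements = False
        for i in range(last):            # compare locations i and i+1
            if lst[i] > lst[i + 1]:
                lst[i], lst[i + 1] = lst[i + 1], lst[i]
                swapped_elements = True
        if not swapped_elements:
            break                        # best case: N-1 comparisons, already sorted
    return lst
```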
c. Shellsort begins by considering the full list of values as a set of interleaved sublists; on the 1st pass, it may deal with sublists that
are just pairs of elements; on the 2nd pass, it could deal with groups of 4 elements each; the process repeats, increasing the # of
elements per sublist and decreasing the # of sublists; start with an increment that is 1 less than the largest power of 2 that is smaller
than the size of the list; if the list has 1,000 elements our first increment will be 511; the increment also indicates the # of sublists; the
first sublist has the elements in locations 1 and 1+increment, and the last sublist has to start in location increment
i. Complexity a complete analysis of the shellsort algorithm is very complex; in the worst case it is O(N^(3/2))
ii. Choice of increment the choice of the increment sequence can have a major effect on the order of shellsort; attempts at
finding an optimal increment sequence have not been successful
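The process above can be sketched with increments of the form 2^k − 1 (…, 15, 7, 3, 1), as the notes describe; each pass is an increment-spaced insertion sort:

```python
def shellsort(lst):
    """Shellsort using increments of the form 2^k - 1, ending with increment 1."""
    increment = 1
    while increment * 2 + 1 < len(lst):
        increment = increment * 2 + 1      # largest 2^k - 1 below the list size
    while increment >= 1:
        for i in range(increment, len(lst)):
            value, j = lst[i], i
            # insertion sort within the sublist spaced `increment` apart
            while j >= increment and lst[j - increment] > value:
                lst[j] = lst[j - increment]
                j -= increment
            lst[j] = value
        increment //= 2                    # (2^k - 1) // 2 = 2^(k-1) - 1
    return lst
```

For a 1,000-element list the first increment is 511, and for the 7-element quiz list it is 3, both consistent with the notes above.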
d. Radix sort does not actually compare key values to each other; we create a set of "buckets" and distribute the entries into the
buckets based on their key values; after collecting the values and repeating this process for successive parts of the key, we can create a
sorted list
i. Analysis each key is looked at once for each digit of the longest key; so if the longest key has M digits and there are N keys,
radix sort has order O(MN); M is relatively small, so this algorithm is of linear complexity, O(N); if we use arrays for the buckets,
we will need 10N additional space if the keys are numeric; why is the size of each array N? Because we can't assume that the keys
will be uniformly distributed among the buckets
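The distribute-and-collect process above can be sketched for numeric keys (using Python lists as growable buckets rather than fixed 10N arrays):

```python
def radix_sort(keys, num_digits):
    """Distribute keys into 10 buckets per digit, least significant digit first."""
    for d in range(num_digits):
        buckets = [[] for _ in range(10)]
        for key in keys:
            buckets[(key // 10 ** d) % 10].append(key)   # bucket by the d-th digit
        keys = [key for bucket in buckets for key in bucket]  # collect in bucket order
    return keys
```

On [15, 14, 10, 2, 5, 8, 4] the first pass yields [10, 2, 14, 4, 15, 5, 8], matching Quiz 3 question 3.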
e. Heapsort based on a special type of binary tree called a heap; heap (for every subtree the value at the root is larger than all the
values in the 2 children); store a heap as a list; fix a heap (taking the largest element out of the root and moving it to the list leaves
the root vacant); remove the largest element, which is at the 1st index in the list, which also means the root is removed; how do we
maintain a nearly complete tree? The only node we can delete from the tree and still have a nearly complete tree is the last node
i. Operation 1: insert a new element; there is only one place where we can insert a new node and still have a nearly complete
binary tree; the only place the heap property can possibly fail is at the new node; we compare the element with its parent
node and, if the heap property fails at the new node, swap them and repeat up the tree
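The insert operation above can be sketched as follows (a minimal sift-up on a 0-based Python list; the notes index the list from 1, so here the parent of index i is (i − 1) // 2):

```python
def heap_insert(heap, value):
    """Insert value into a max-heap stored as a 0-based list, sifting it up."""
    heap.append(value)                    # the only legal spot: the next leaf position
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        # heap property fails at the new node: swap with the parent and repeat
        heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
        i = (i - 1) // 2
    return heap
```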
f. Mergesort/Merging 2 sorted arrays the heart of the Mergesort algorithm is the merging of 2 already-sorted arrays; the comparisons
necessary to determine which element will be copied; how do you sort each half? Recursion; you divide the half into 2 quarters, sort
each of the quarters, and merge them to make a sorted half; divide the array again and again until you reach a subarray with only one
element; this is the base case; it’s assumed an array with 1 element is already sorted
i. Analysis breaking the list into 2 sublists, each one half the size of the original, N/2; the combine step will take N/2
comparisons in the best case and N/2 + N/2 − 1 comparisons in the worst case; recurrence relations for the worst (W) and best
(B) cases: W(N) = 2W(N/2) + N − 1 and B(N) = 2B(N/2) + N/2; both solve to O(N lg N)
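The merge step at the heart of the algorithm can be sketched as (function name illustrative):

```python
def merge(a, b):
    """Merge two already-sorted lists into one sorted list."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:                 # one comparison decides which element is copied
            result.append(a[i]); i += 1
        else:
            result.append(b[j]); j += 1
    return result + a[i:] + b[j:]        # one side may still have leftovers
```

On [2, 5, 7] and [1, 3, 4] this does 4 comparisons before one side runs out, matching Quiz 3 question 5.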
Regular expressions are associated with language operations: + operator (union), . operator (concatenation), * operator (star closure), ()
(grouping symbols for other operations); we can determine the language of a regular expression by applying the transformations
i. Regular grammars right-linear grammar generates a regular language; the left-hand side of a production rule can have only a single
nonterminal symbol; the right-hand side can have 0+ terminal symbols but at most 1 non-terminal symbol; the non-terminal must be
the rightmost symbol
j. Designing a regular grammar think of each of the nonterminal symbols as having a meaning or purpose
Quiz 1:
1. Analyzing an algorithm determines the amount of time that algorithm takes to execute, which is the exact # of seconds FALSE
2. To analyze an algorithm’s efficiency, we should obtain an equation that reflects the # of operations the algorithm performs, which is an
expression based on the size of input
3. ⌊12.2⌋ = 12
4. ⌈12.2⌉ = 13
5. 4! = 24
6. log₂(32) = 5
7. Assume we have a fair eight-sided die with the numbers 1, 2, 3, 3, 4, 4, 5, 5 on its sides. What is the probability that the number 5 will be
rolled? 2/8
8. The order of the function f(x) = x² + x lg x is: x²
9. Ο(f ) represents the class of functions that grow no faster than f
10. Which of the following expressions has the highest order? n³
Quiz 2:
1. Building a tournament tree for a set of 10 elements will take ____ comparisons. 9
2. An algorithm is optimal means there is no algorithm that will work more quickly. True
3. The following is a decision tree of a sorting algorithm, how many comparisons are needed for the worst case for this algorithm
3
4. What is the average number of comparisons for the above sorting algorithm? 2.67
5. Consider square numbers defined as follows (for positive integers):
Square(1) = 1
Square(N) = Square(N-1) + 2N-1
According to this definition, what is Square(3)? Square(3) = Square(2) + 2·3 − 1 = 4 + 5 = 9
6. How many comparisons are needed for the Sequential Search on a list N elements in the worst case? N
7. In worst case, how many comparisons are needed for searching an element from a set of 15 elements with binary search? 4
8. In best case, how many comparisons are needed for searching an element from a set of 15 elements with binary search? 1
9. Which of the following lists is partitioned by the element 15? [16, 18, 15, 4, 6, 11, 8, 7, 10]
10. What does the list [14, 12, 5, 8, 2, 1, 7, 16] look like after the first pass of insertion sort? [12, 14, 5, 8, 2, 1, 7, 16]
Quiz 3:
1. Consider the list [15, 14, 10, 2, 5, 8, 4], assume the increment is 3 for the first pass of shell sort, what would the list be after the first pass? [2,
5, 8, 4, 14, 10, 15]
2. Assume that one uses the formula below to determine the increments for shell sort, what is the value of the increment for the first pass to shell
sort a list of 30 elements?
13
3. Consider the list [15, 14, 10, 2, 5, 8, 4], if using radix sort, the numbers will be distributed into buckets and copied back to form a list, what
does the list look like after the first pass? 10, 2, 14, 4, 15, 5, 8
4. Suppose array a = {15, 12, 9, 10, 6, 7} represents a heap, what is the right child of 12? 6
5. Assume we use merge algorithm to combine two sorted lists [2, 5, 7] and [1, 3, 4] into a one big sorted list, how many comparisons will be
done in this procedure? 4
6. Assume the list [3, 5, 1, 6, 2]; what will the list look like after calling PivotList(list, 1, 5)? Here we use 1 as the minimum index of a list.
[2, 1, 3, 6, 5]
7. What is the factorized result for the polynomial 3x³ + 5x + 6? ((3x + 0)x + 5)x + 6
8. How many multiplications are needed to calculate a 2*3 matrix and a 3*5 matrix by standard matrix multiplication algorithm? 30
9. What is the result of A × B, where A and B are matrices, A = [1 2 2; 2 1 1] and B = [2 0; 1 1; 1 3]? [6 8; 6 4]
10. What is the size of the matrix used to solve 5 linear equations with 5 unknowns? 5 × 6 (one column per unknown plus the constants)
Pumping Lemma & Pushdown Automata
Limits on finite automata
o Finite automata cannot do counting or matching
o We can show that a language is regular by creating a finite automaton that accepts it or a regular grammar that generates it
How to show a language is not regular? If a language has a finite # of words, it must be regular; draw an NFA for each of the individual words
and add edges from the start state to each of these NFAs; if an infinite language is regular, then some DFA with N states must accept it; since the
language is infinite, it must have some word w longer than N; while processing w, the automaton must wind up in one state more than once;
denote that state as R
A long word in the regular infinite language: the beginning of the word takes the automaton from the starting state to R; the middle part of
the word takes it from R back to R; the end of the word takes it from state R to an accepting state; we write the word as 3 parts, w = xyz (x
takes the automaton from the starting state to R, y takes it from R back to R, z takes it from state R to an accepting state)
Pumping lemma
x = a^j, y = a^k, z = a^(M−k−j) b^M; step 4: we pump the word down so i = 0 (xy^0z = xz = a^(M−k) b^M; since k ≥ 1, M − k ≠ M, so
the word is not in the language)
Pushdown automata like a finite automaton but with the addition of storage capability; the storage is in the form of a stack; the transition
function is dependent on the current input symbol and the symbol on the top of the stack; we use 1 as the initial state, F as the accepting state;
the pushdown automaton begins with a special symbol on the stack; whenever that symbol is at the top of the stack, the transitions know the
stack is empty; the transition function format: δ(s, a, b) = (s′, x) (s is the current state, a is the current input symbol, b is the symbol at the top
of the stack; s′ is the new state, x is the symbol(s) that will replace b at the top of the stack)
o Some transition examples
o A pushdown automaton
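The stack-based matching a pushdown automaton performs can be sketched for the classic non-regular language { aⁿbⁿ : n ≥ 1 }; the states, the '#' empty-stack marker, and the function name here are my own illustrative choices, not from the notes:

```python
def accepts_anbn(word):
    """Simulate a pushdown automaton for { a^n b^n : n >= 1 }."""
    stack = ['#']              # '#' marks the empty stack, like the special symbol above
    state = 1                  # state 1: reading a's; state 2: reading b's
    for symbol in word:
        if state == 1 and symbol == 'a':
            stack.append('a')              # push one marker per a
        elif state in (1, 2) and symbol == 'b' and stack[-1] == 'a':
            state = 2
            stack.pop()                    # match one b against one pushed a
        else:
            return False                   # no transition defined: reject
    return state == 2 and stack[-1] == '#'  # all a's matched and at least one b seen
```

The stack is exactly the counting ability that finite automata lack.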
o Parse tree we can graphically depict the parse of a word by drawing a parse tree; the root node has the starting symbol; any time
that a rule is applied to a nonterminal symbol, the node for that symbol has children labeled with the symbols on the right-hand side of
the rule; when the parse tree has been completed, all of the leaf nodes will be labeled with terminal symbols; the word can be read by
looking at the leaves from left to right; in most cases, you will get the same parse tree no matter in what order the rules are applied;
however, there are grammars that can produce different parse trees based on the order the rules are used—Ambiguous Grammar
o Ambiguous grammar example
Slide array slide the pattern so that the b character of the text lines up with the b character of the pattern; then we begin the matching process
from the right end again; to do this, we need to reset patternLoc to be the size of the pattern, and textLoc has to be increased by 4, which is
what really moves the pattern; need to determine how much to increase textLoc based on the character that didn’t match; use an array called
slide that is as large as the character set that can appear in the text; initialize each element of the slide array to the size of the pattern, because
any characters not in the pattern should move the pattern that amount; if a character appears more than once, the slide value will move the
pattern so the alignment is with the last occurrence
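The slide array construction described above can be sketched as follows; this is one common formulation of a Boyer–Moore-style bad-character table, and exact shift values differ between variants, so treat the numbers as illustrative:

```python
def build_slide(pattern, alphabet):
    """Build the slide array: how far the pattern moves on a mismatched text char."""
    m = len(pattern)
    # characters not in the pattern move the pattern its full length
    slide = {ch: m for ch in alphabet}
    for i, ch in enumerate(pattern):
        # later occurrences overwrite earlier ones: align with the last occurrence
        slide[ch] = m - 1 - i
    return slide
```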
Jump array jump (the same size as our pattern) will encode information about the pattern relative to itself; this new array will be able to let
us know, for example, that when the h and t in Fig don’t match, we need to move the pattern completely past where we currently are
Approximate string matching common problems that might have caused mismatches between the substring and the text (the corresponding
characters in the substring and text are different; the substring has a character the text doesn’t have; the text has a character the substring
doesn’t have); a k-approximate match: k represents the maximum # of differences
o Example
Attempt to match the substring "ad" with the text "read"; the 1st position has 2 possible 2-approximate matches (the a is
changed to an r and the d is changed to an e; there could be an "re" added to the front of the string); the 1st position has a
possible 3-approximate match (add an r and change the "ad" to an "ea"); the 2nd position has a 2-approximate match
(change the "ad" to an "ea") and a 1-approximate match (add an e to the front)
o Notice that there can be a lot of possibilities and they build very quickly; if the 1st few characters matched, but then we hit a sequence
that didn’t, we might find a better match if we changed some characters or put some extra characters into the substring, into the text,
or into both
o How can we consider the possibilities and still do this with a reasonable algorithm and data structure? Solve this problem by creating a
matrix that we will call diffs to hold the information that we have gathered so far
Diffs each row of this matrix will be associated with one of the characters in the substring; each column will be associated
with one of the characters in the text; the values in the matrix will give us an idea of how well the matching process is going
at that point; if the value in row 5 column 27 is a 4, in matching the first five characters of the substring with the portion of the
text ending at location 27, we have found four differences
For any value of diffs[i,j], we will look at the minimum of 3 values
To get this process started, if we refer to any location above the matrix (in other words, i=0), that location will be considered
to have a zero stored in it; if we refer to any location to the left of the matrix (in other words, j=0), that location will be
considered to have the corresponding value of i stored in it
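The diffs recurrence above (the minimum of three values: change a character, extra substring character, extra text character) can be sketched as; row 0 holds zeros so a match may start anywhere in the text, and column 0 holds i, as described:

```python
def k_approximate_diffs(substring, text):
    """Build diffs: diffs[i][j] = fewest differences matching the first i substring
    characters against text ending at position j."""
    m, n = len(substring), len(text)
    diffs = [[0] * (n + 1) for _ in range(m + 1)]   # row 0: all zeros
    for i in range(1, m + 1):
        diffs[i][0] = i                              # column 0: value i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            mismatch = 0 if substring[i - 1] == text[j - 1] else 1
            diffs[i][j] = min(diffs[i - 1][j - 1] + mismatch,  # change a character
                              diffs[i - 1][j] + 1,   # substring has an extra character
                              diffs[i][j - 1] + 1)   # text has an extra character
    return diffs
```

For substring "ad" and text "read", the bottom row reports a 0-difference match ending at the last text position.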
Graph Algorithms
Graph is a formal description for a wide range of situations; a road map: the locations of intersections and roads connecting them
Some types directed graphs, undirected graphs, weighted graphs
o A directed graph
o Weighted graph
Graph background and terminology a graph is an ordered pair, G=(V,E); V: the nodes or vertices of the graph, E: the edges of the graph; if
there is an edge between A and B, we can travel from A to B; we use AB to represent the edge between A and B
Undirected graph has edges that can be traversed in either direction; an edge is a set, which contains the labels of the nodes that are at the 2
ends of the edge
Directed graph has edges that can only be traversed in one direction; our set of edges will have ordered pairs in which the first item is where
the edge starts and the second is where the edge ends
Terminology a complete graph is a graph with an edge between every pair of nodes; if there are N nodes, there will be (N² − N)/2 edges in a
complete undirected graph; if there are N nodes, there will be N² − N edges in a complete directed graph; a subgraph (VS, ES) of a graph or
digraph (V, E) is one that has a subset of the vertices (VS ⊆ V) and edges (ES ⊆ E) of the full graph; a path between 2 nodes of a graph or
digraph is a sequence of edges that can be traveled consecutively; we say that a path from node vi to vj is the sequence of edges
vivi+1, vi+1vi+2, …, vj−1vj that are in the graph; we require that all of the nodes along this path be unique; a path is said to have a length that
represents the # of edges that make up the path; the path AB, BC, CD, DE has length 4; a weighted graph or digraph is one where each edge
has a value, called the weight, associated with it; we consider the weight to be the cost for traversing the edge; a path through a weighted graph
has a cost that is the sum of the weights of each edge in the path; in a weighted graph, the shortest path between 2 nodes is the path with the
smallest cost, even if it doesn’t have the fewest edges; a graph or digraph is called connected if there is at least one path between every pair of
nodes; a cycle is a path that begins and ends at the same node; an acyclic graph or digraph is one that has no cycles; a graph that is connected
and acyclic is called an unrooted tree; an unrooted tree has the structure of a tree except that no node has been specified as the root but every
node could serve as the root
Ways to store a graph an adjacency matrix, an adjacency list
Adjacency matrix AdjMat, for a graph G=(V,E), with |V|=N, will be stored as a 2-dimensional array of size N×N; each location [i,j] of this
array will store a 0, except if there is an edge from node vi to node vj, the location will store a 1;
for weighted graphs and digraphs, the adjacency matrix entries would be ∞
if there is no edge; if there is an edge, the entry is the weight; the diagonal elements would be 0, because there is no cost to travel from a node to
itself
Adjacency list AdjList, for a graph G=(V,E) with |V|=N, will be stored as a one-dimensional array of size N, with each location being a
pointer to a linked list; there will be one list for each node and that list will have one entry for each adjacent node; for weighted graphs and
digraphs, the adjacency list entries would have an additional field to hold the weight for that edge
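Both storage schemes above can be sketched for an undirected graph with nodes 1..n (Python dict-of-lists standing in for the array of linked lists; the function name and tuple edge format are my own):

```python
def build_adjacency(n, edges, weighted=False):
    """Build an adjacency matrix and adjacency list for an undirected graph.

    edges: list of (i, j) or, when weighted, (i, j, weight) tuples with 1-based nodes."""
    INF = float('inf')
    # weighted: 0 on the diagonal, INF where there is no edge; unweighted: 0 default
    matrix = [[0 if weighted and r == c else (INF if weighted else 0)
               for c in range(n + 1)] for r in range(n + 1)]
    adj_list = {v: [] for v in range(1, n + 1)}
    for e in edges:
        i, j = e[0], e[1]
        w = e[2] if weighted else 1
        matrix[i][j] = matrix[j][i] = w              # undirected: symmetric entries
        adj_list[i].append((j, w) if weighted else j)
        adj_list[j].append((i, w) if weighted else i)
    return matrix, adj_list
```

The matrix costs O(N²) space regardless of edge count; the list costs space proportional to the number of edges.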
Traversal algorithm there may be times that we wish to do something to each node in the graph exactly once; for example, there may be a
piece of information that needs to be distributed to all of the computers on a network; we want this information to get to each computer and we
do not want to give it to any computer twice
o 2 traversal algorithms depth-first (our traversal will go as far as possible down a path before considering another); breadth-first (our
traversal will go evenly in many directions); we use the phrase “visit the node” to represent the action that needs to be done at each
node; these methods work with both directed and undirected graphs without any changes; either of these traversal methods can also be
used to determine if a graph is connected
Depth-first traversal we visit the starting node and then proceed to follow links through the graph until we reach a dead
end; in an undirected graph, a node is a dead end if all of the nodes adjacent to it have already been visited; in a directed
graph, if a node has no outgoing edges, we also have a dead end; when we reach a dead end, we back up along our path until
we find an unvisited adjacent node and then continue in that new direction; the process will have completed when we back up
to the starting node and all the nodes adjacent to it have been visited
Choice of next node in illustrating this algorithm and all others in this chapter, if presented with a choice of 2 nodes, we
will choose the node with the numerically or alphabetically smaller label; when this algorithm is implemented, that choice
will depend on how the edges of the graph are stored
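The depth-first process above can be sketched over an adjacency list, using the smallest-label rule for choosing the next node (the recursion implicitly handles the "back up along our path" step):

```python
def depth_first_traversal(adj_list, start):
    """Recursive depth-first traversal; neighbors tried in smallest-label order."""
    visited = []

    def visit(node):
        visited.append(node)                 # "visit the node"
        for neighbor in sorted(adj_list[node]):
            if neighbor not in visited:
                visit(neighbor)              # go as far as possible down this path

    visit(start)
    return visited
```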
Breadth-first traversal we visit the starting node and then on the 1st pass visit all of the nodes directly connected to it; in
the 2nd pass, we visit nodes that are 2 edges “away” from the starting node; with each new pass, we visit nodes that are one
more edge away; it is possible for a node to be on 2 paths of different lengths from the starting node; because we will visit
that node for the 1st time along the shortest path from the starting node, we will not need to consider it again; need to keep a
list of the nodes we have visited
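The breadth-first process above can be sketched with a queue, keeping the list of visited nodes so that no node is visited twice:

```python
from collections import deque

def breadth_first_traversal(adj_list, start):
    """Queue-based breadth-first traversal; visits nodes in increasing edge distance."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adj_list[node]):
            if neighbor not in visited:
                visited.append(neighbor)     # first visit is along a shortest path
                queue.append(neighbor)
    return visited
```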
o Traversal analysis our goal for these 2 traversal algorithms was to create a process that would visit each node of a connected graph
exactly once; the work done to check to see if an adjacent node has been visited and the work to traverse the edges is not significant in
this case; so the order of the algorithm is the # of times a node is visited; these traversals are therefore of order O(N)