
ENGRECE 235: Design and Analysis of Algorithms

Course Code: 15845


Quarter: Fall 2003
Professor Pai H. Chou
Electrical Engineering and Computer Science
September 26, 2003
Contents
1 Introduction to Analysis and Design of Algorithms
1.1 Algorithm
1.1.1 What's an algorithm?
1.1.2 Classes of algorithms
1.1.3 Example: sorting problem
1.1.4 Many sorting algorithms available:
1.2 Analysis
1.2.1 Execution model assumption:
1.2.2 Algorithm runtime
1.3 Algorithm design
1.3.1 Approaches:
1.3.2 Example: merge sort
1.4 Problem complexity vs. Algorithm Complexity
1.5 Data Structures
1.6 Python
2 Growth of Functions, Data structures
2.1 Asymptotic Notations
2.1.1 big O = asymptotic upper bound: (page 44)
2.1.2 big $\Omega$ = asymptotic lower bound (page 45)
2.1.3 big $\Theta$ = asymptotic tight bounds
2.1.4 little-o, little-$\omega$: not asymptotically tight
2.2 Polynomials, log, factorial, Fibonacci
2.2.1 Polynomial in n of degree d:
2.2.2 logarithm
2.2.3 Factorial:
2.2.4 Fibonacci
2.2.5 misc
2.3 Python: Functions
2.4 Data Structures Review
2.4.1 Array implementation
2.4.2 Stack and Calls/Recursion
2.4.3 Linked List x
2.4.4 Binary search tree
2.4.5 rooted trees w/ unbounded branches
3 Recurrences; QuickSort
3.1 Recurrence: by Substitution method
3.2 Recursion tree, convert to summation
3.3 The Master Method
3.4 QuickSort
3.4.1 Worst case: uneven partitioning
3.4.2 Best case: balanced partitioning
3.4.3 Average case
4 Heap sort, Priority Queues
4.1 Heap data structure: Binary heap
4.2 Heapify:
4.3 BuildHeap
4.3.1 Complexity:
4.3.2 HeapSort
4.4 Python Programming
4.5 Priority Queues
4.5.1 Heap Insert:
4.5.2 Extract max:
4.5.3 Heap Increase key
5 Linear Time sorting, Medians/Order Stat
5.1 Comparison-based sorting: prob. complexity
5.2 Non-comparison sorts
5.2.1 Counting sort
5.2.2 RadixSort
5.2.3 Bucket sort
5.3 Median and Order statistics
5.3.1 Find min? max? both min and max?
5.3.2 How to find the i-th smallest w/out sorting?
5.3.3 Can show that worst-case select is O(n)
5.3.4 Justification:
6 Binary Search Trees and Red-Black Trees
6.1 Binary Search Trees
6.1.1 API for BST
6.2 Python Programming with Trees
6.2.1 Node data structure
6.2.2 Tuple-encoded Trees
6.3 Red-Black Trees
6.3.1 Five Properties: (important!!)
6.3.2 How to maintain the Red-Black Tree properties?
6.3.3 Insertion
6.3.4 Deletion
6.3.5 Tree data structure
7 Dynamic Programming
7.1 Optimization problems
7.1.1 Dynamic programming
7.2 Case study: Assembly line scheduling
7.3 Matrix-Chain multiply
7.3.1 optimal parenthesization to minimize # scalar mults
7.4 Longest common subsequence
7.5 Optimal Binary Search Trees
8 Greedy Algorithm
8.1 Activity Selection problem
8.2 Greedy choice property
8.3 Huffman codes
9 Basic Graph Algorithm
9.1 Graphs G = (V, E)
9.2 Breadth-first search
9.3 Depth-first search
9.4 Topological sort
9.5 Strongly Connected Components (SCC)
9.6 Python Programming: How to represent a graph!?
9.6.1 Simple: Directed graph, no weights
9.6.2 Weighted graph, use dictionary of dictionaries
9.6.3 Wrap data structure in a class
10 Minimum Spanning Trees
10.1 Min-spanning trees
10.2 Kruskal's algorithm:
10.2.1 Runtime analysis
10.3 Prim's algorithm:
11 Single Src Shortest Paths
11.1 DAG shortest paths
11.1.1 Notation
11.2 Dijkstra's
11.2.1 Algorithm:
11.2.2 Example:
11.2.3 Correctness proof sketch:
11.2.4 What about negative weights?
11.3 Bellman-Ford
11.3.1 Algorithm:
11.3.2 Correctness:
11.3.3 Complexity: O(VE)
11.4 Bellman-Ford: special case of linear programming
11.4.1 Bellman Ford as a Constraint Solver!
12 All Pairs Shortest Paths
12.1 Special case Linear Programming (cont'd)
12.1.1 Correctness sketch:
12.2 All Pairs shortest path
12.2.1 Formulation
12.2.2 Repackage this as a matrix structure:
12.2.3 Floyd-Warshall: $O(V^3)$
12.2.4 Algorithm - very simple!
12.3 Transitive closure: $E^*$ (Warshall's)
12.4 Sparse graphs: Johnson's algorithm
12.4.1 Proof weight works:
13 Flow Networks
13.1 Flow network
13.1.1 Maxflow methods:
13.2 FORD-FULKERSON
13.3 EDMONDS-KARP
13.4 Maximal bipartite matching.
14 NP-Completeness
14.1 Other Graph problems
14.1.1 Euler Tour
14.1.2 Hamiltonian cycles
14.1.3 Traveling Salesman
14.1.4 Constructive vs. Decision vs. Verification Problems
14.2 Problem complexity
14.2.1 Formal Languages
14.2.2 Decision problem in Formal language
14.2.3 Decision in polynomial time
14.2.4 P, NP, co-NP in terms of formal languages
14.3 Reduction
14.4 NP, NP-complete, NP-hard
14.4.1 How to show $L \in$ NP-Complete?
14.4.2 Examples: Circuit Satisfiability:
14.4.3 NP completeness proofs
14.5 NPC Proof for (Formula) Satisfiability problem.
14.5.1 3-CNF satisfiability (3-SAT)
14.6 NPC proof cont'd: k-clique problem (CLIQUE)
15 Approx. Algorithms
15.1 k-VERTEX-COVER problem
15.1.1 Observation:
15.1.2 Approximation algorithm for Vertex Cover
15.2 Hamiltonian (HAM-CYCLE)
15.3 Traveling Salesman Problem (TSP)
15.3.1 Approximated TSP
15.3.2 General TSP:
15.4 Subset-sum problem
15.4.1 Exact solution:
15.4.2 NP Complete proof
15.4.3 polynomial-time approximation
15.5 Set Covering problem
15.5.1 Greedy approximation for SET-COVER
Lecture 1
Introduction to Analysis and Design of Algorithms
Administrative
Course syllabus: http://e3.uci.edu/03f/15845/syllabus.html
1.1 Algorithm
1.1.1 What's an algorithm?
input, output problem statement
sequence of steps
data structures
must always halt and output must satisfy property
1.1.2 Classes of algorithms
sorting (classified by problem)
tree traversal, graph algorithms (classified by data structures / representation)
dynamic programming (classified by technique)
incremental, greedy, divide-&-conquer algorithms (classified by decision criteria)
Why this class?
fundamental cross cutting applications
analysis aspect: problem complexity, what problems are
easy/hard, how to quantify
many solutions to the same problem
many applications of a given solution
1.1.3 Example: sorting problem
input: array $A = [a_1, a_2, \ldots, a_n]$;
output: array $A' = [a'_1, a'_2, \ldots, a'_n]$ such that $a'_1 \le a'_2 \le \cdots \le a'_n$
1.1.4 Many sorting algorithms available:
insertion sort
See Chapter 2, Sec. 2.1 for an explanation; we will revisit sorting later.
idea: partition A into two: $A[1 \ldots j-1]$ already nondecreasing, $A[j \ldots n]$ not yet sorted.
grow this sorted subarray by one more element
insert it into the tail of the sorted one, but need to make it sorted
shift everybody up by one until the nondecreasing property is satisfied again
done when $j - 1 = n$
[Figure: insertion sort trace on input A = 5 2 4 6 1 3 for j = 2, ..., 6, showing the sorted prefix A[1..j-1], the unsorted suffix A[j..n], and the new A[j] being inserted at each step; output 1 2 3 4 5 6]
1.2 Analysis
1.2.1 Execution model assumption:
RAM (random access machine) as opposed to stack,
queue, or FSM!
sequential instruction execution
alternative assumptions: parallelism, communication,
synchronization
associative memory (content-addressable)
data structures: pointers/linked lists instead of copying
1.2.2 Algorithm runtime
measure it, but want a formula T(n) where n is problem
size
want to factor out processor speed as a scaling factor
worst case, best case, average case
algorithm memory consumption (visit later)
Chapter 3 deals with order of growth, e.g., $\Theta(n^2)$
Insertion sort: $O(n^2)$
Quick sort: $O(n^2)$ worst case, $O(n \lg n)$ avg.
Merge sort, Heap Sort: $\Theta(n \lg n)$
1.3 Algorithm design
1.3.1 Approaches:
incremental
divide-and-conquer (e.g., MergeSort, QuickSort)
greedy (e.g., Min-Spanning Tree)
backtracking
randomization
genetic
1.3.2 Example: merge sort
divide array into two halves
sort left and sort right
merge (interleave) the two sorted halves
uses recursion
$\Theta(n \lg n)$ complexity, where lg = log base 2
but uses at least twice as much memory in order to keep the merge step linear
[Figure: MergeSort call trace on input A[1:6] = 5 2 4 6 1 3, steps (1) to (16): split into A[1:3] = 5 2 4 and A[4:6] = 6 1 3, recurse down to single elements, merge back up (2 5, then 2 4 5 and 1 3 6), and finish with Merge(A[1:3], A[4:6]) giving output A[1:6] = 1 2 3 4 5 6]
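The merge sort steps above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the names `merge` and `merge_sort` are chosen here, and the sketch returns a new list rather than sorting in place, which makes the extra memory use visible.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list in linear time."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller head element; <= keeps equal keys in input order.
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(a):
    """Return a sorted copy of a using divide-and-conquer."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([5, 2, 4, 6, 1, 3]))
```

Calling `merge_sort` on the figure's input 5 2 4 6 1 3 yields the sorted list 1 2 3 4 5 6. The temporary `result` list in `merge` is the additional memory the notes mention.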
1.4 Problem complexity vs. Algorithm Complexity
Measures how hard the problem is
Bounds the complexity of all algorithms for this problem
Example: can't do better than $O(n \lg n)$ for comparison-based sorting, i.e., the problem has an $\Omega(n \lg n)$ lower bound
But can do better (linear time) using non-comparison-based algorithms.
The problem itself may be hard! No efficient algorithm is known for those problems.
NP-Complete problems:
Guess an answer, verify (yes/no) in polynomial time; otherwise exponential or worse.
Approximation algorithms: not minimum/optimal cost, but come within a factor of it.
1.5 Data Structures
fixed: arrays, matrices
growable: stacks, queues, linked list, tree, heap
key addressable: hash table
structures: graphs, flow networks
1.6 Python
Scripting language, free!
Already installed on unix; download for any platform
Easy, powerful, interactive
Best way is to try it out; helps you debug algorithm
Tutorial at http://www.python.org/doc/current/tut/tut.html
Important features
No Semicolons, no C-style { }
Indentation is significant
Dynamically typed
# starts a comment, unless inside string
Interactive mode
% python
>>> 2 + 3
5
>>> x = (2 + 3) * 4
>>> x
20
>>> print("hello world")
hello world
Lists and Arrays
>>> x = [1, 2, 3]
>>> x
[1, 2, 3]
>>> x[0]
1
>>> x.append(20)
>>> x
[1, 2, 3, 20]
>>> len(x) # get the number of elements
4
>>> x.pop() # pop from tail like a stack
20
>>> x
[1, 2, 3]
>>> x.pop(0) # pop from head like a queue
1
>>> x
[2, 3]
You can have other data types in lists and mix them:
>>> dayofweek = ['Sun', 'Mon', 'Tue', 'Wed']
>>> dayofweek[3]
'Wed'
Hash Table (Dictionary)
Q: What if we want to look up the number from the name?
Hash tables let you use the name string as an index. However, one big difference is that lists are ordered by position, whereas hash tables are looked up by key and have no positional index (Python 3.7+ dictionaries do remember insertion order).
>>> d = {} # make an empty hash table
>>> d['Sun'] = 0
>>> d['Mon'] = 1
>>> d['Tue'] = 2
>>> d['Wed'] = 3
>>> d
{'Sun': 0, 'Mon': 1, 'Tue': 2, 'Wed': 3}
>>> d['Tue']
2
>>> 'Mon' in d # membership test (d.has_key('Mon') in Python 2)
True
>>> 'Blah blah' in d
False
>>> list(d.keys())
['Sun', 'Mon', 'Tue', 'Wed']
Lecture 2
Growth of Functions, Data structures
This lecture:
big O, big $\Omega$, and big $\Theta$: how they grow
fundamental data structures
2.1 Asymptotic Notations
purpose: express run time as T(n), n = input size
assumption: $n \in \mathbb{N} = \{0, 1, 2, \ldots\}$ (natural numbers)
2.1.1 big O = asymptotic upper bound: (page 44)
$O(g(n))$ is the set of functions $f(n)$ for which there exist constants $c > 0$ and $n_0 > 0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$
$f(n)$ grows no faster than $g(n)$
example: $2n^2 = O(n^3)$, for $c = 1$, $n_0 = 2$
funny equality, more like a membership: $2n^2 \in O(n^3)$
$n^2$ and $n^3$ are actually functions, not values.
sloppy but convenient
2.1.2 big $\Omega$ = asymptotic lower bound (page 45)
swap the places of $f(n)$ and $c\,g(n)$:
$0 \le c\,g(n) \le f(n)$ for all $n \ge n_0$
example: $\sqrt{n} = \Omega(\lg n)$, for $c = 1$, $n_0 = 16$
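2.1.3 big $\Theta$ = asymptotic tight bounds
need two constants $c_1$ and $c_2$ to sandwich $f(n)$: $0 \le c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$.
example: $\frac{1}{2}n^2 - 2n = \Theta(n^2)$ for $c_1 = \frac{1}{4}$, $c_2 = \frac{1}{2}$, $n_0 = 8$
Theorem: $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$
[Figure: three plots of $f(n)$ against $g(n)$ to the right of $n_0$, one each for $O(g(n))$, $\Omega(g(n))$, and $\Theta(g(n))$]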
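The constants quoted in the three examples can be spot-checked numerically. The sketch below is only a sanity check over a finite range (the helper `holds_from` and the cutoff 5000 are choices made here); it does not replace the algebraic argument.

```python
import math


def holds_from(pred, n0, limit=5000):
    """Return True if pred(n) holds for every integer n in [n0, limit)."""
    return all(pred(n) for n in range(n0, limit))


def big_o(n):
    # 2n^2 = O(n^3) with c = 1, n0 = 2
    return 0 <= 2 * n**2 <= 1 * n**3


def big_omega(n):
    # sqrt(n) = Omega(lg n) with c = 1, n0 = 16
    return 0 <= 1 * math.log2(n) <= math.sqrt(n)


def big_theta(n):
    # n^2/2 - 2n = Theta(n^2) with c1 = 1/4, c2 = 1/2, n0 = 8
    # (all three sides multiplied by 4 to stay in exact integer arithmetic)
    return n**2 <= 2 * n**2 - 8 * n <= 2 * n**2


print(holds_from(big_o, 2), holds_from(big_omega, 16), holds_from(big_theta, 8))
print(big_o(1), big_omega(15), big_theta(7))
```

Each inequality holds from its stated $n_0$ onward and fails just below it (at n = 1, 15, and 7 respectively), which is why the definitions only require the bound for all $n \ge n_0$.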
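Example: insertion sort
$O(n^2)$ worst case, $\Omega(n)$ best case
$$\begin{aligned}
T(n) ={} & c_1 n && \text{// for loop} \\
{}+{} & c_2 (n-1) && \text{// key} \leftarrow A[j] \\
{}+{} & \textstyle\sum_{j=2}^{n} \mathrm{Insert}(j) && \text{// loop to insert } j \\
{}+{} & c_8 (n-1) && \text{// } A[i+1] \leftarrow \text{key}
\end{aligned}$$
Best case of Insert(j) is $\Theta(1)$, worst is $\Theta(j)$
Issue: $\Theta(n^2) + \Theta(n)$ is still $\Theta(n^2)$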
Example: MergeSort(A, p, r)
where p, r are the lower and upper bound indexes to array A.
if p < r then
    q := ⌊(p+r)/2⌋
    MergeSort(A, p, q), MergeSort(A, q+1, r)
    Merge(A, p, q, r)
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 2\,T(\frac{n}{2}) + \Theta(n) & \text{if } n > 1 \end{cases}$$
T(n) is expressed in terms of T itself! This is called a recurrence. We want a closed-form solution (no T on the right-hand side of the equation).
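To solve the recurrence, rewrite it as
$$T(n) = \begin{cases} c & \text{if } n = 1 \\ 2\,T(\frac{n}{2}) + c\,n & \text{if } n > 1 \end{cases}$$
and expand:
$$\begin{aligned}
T(n) &= 2T(\tfrac{n}{2}) + cn \\
&= 2\left(2T(\tfrac{n/2}{2}) + c\tfrac{n}{2}\right) + cn \\
&= 4T(\tfrac{n}{4}) + 2c\tfrac{n}{2} + cn = 4T(\tfrac{n}{4}) + 2cn \\
&= 8T(\tfrac{n}{8}) + 3cn = \cdots \\
&= 2^i\, T(\tfrac{n}{2^i}) + i\,cn
\end{aligned}$$
We can divide n by 2 at most $\lg n$ times before n = 1. So, with $i = \lg n$,
$T(n) = (\lg n)\,cn + cn = \Theta(n \lg n)$.
Chapter 4 will show how to solve recurrences systematically.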
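The closed form $T(n) = cn \lg n + cn$ can be checked against the recurrence directly. In this sketch the constant `C = 3` is an arbitrary assumed value, and the check is limited to powers of two so that n/2 is always an integer.

```python
from functools import lru_cache

C = 3  # assumed value for the constant c; any positive integer works


@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(1) = c, T(n) = 2 T(n/2) + c n for n a power of two."""
    if n == 1:
        return C
    return 2 * T(n // 2) + C * n


def closed_form(n):
    """c n lg n + c n, with lg n computed exactly for powers of two."""
    lg_n = n.bit_length() - 1
    return C * n * lg_n + C * n


for k in range(6):
    n = 2 ** k
    print(n, T(n), closed_form(n))
```

The two columns agree for every power of two tried, matching the expansion with $i = \lg n$.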
2.1.4 little-o, little-$\omega$: not asymptotically tight
$o(g(n)) = \{ f(n) : \forall c > 0\ \exists n_0 > 0 : 0 \le f(n) < c\,g(n)\ \ \forall n \ge n_0 \}$
$\omega(g(n)) = \{ f(n) : \forall c > 0\ \exists n_0 > 0 : 0 \le c\,g(n) < f(n)\ \ \forall n \ge n_0 \}$
Example
$2n^2 = O(n^2)$ is a tight bound $\Rightarrow$ $2n^2 \ne o(n^2)$.
$2n = O(n^2)$ is not tight $\Rightarrow$ $2n = o(n^2)$.
does not make sense to have little-$\theta$
2.2 Polynomials, log, factorial, Fibonacci
2.2.1 Polynomial in n of degree d:
$p(n) = \sum_{i=0}^{d} a_i n^i = \Theta(n^d)$ (for $a_d > 0$)
Polynomially bounded: $f(n) = O(n^k)$ for some constant k.
2.2.2 logarithm
useful identity: $a^{\log_b c} = c^{\log_b a}$.
polylogarithmically bounded if $f(n) = O(\lg^k n)$ for some constant k
grows slower than any polynomial: $\lg^b n = o(n^a)$ for any constant $a > 0$.
log-star: $\lg^* n = \min\{ i \ge 0 : \lg^{(i)} n \le 1 \}$
(log of log of log ... of log of n) grows very very slowly.
Example: $\lg^*(2^{65536}) = 5$.
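The log-star definition translates almost literally into a loop. The function name `lg_star` is chosen here for illustration; the sketch relies on `math.log2` accepting arbitrarily large Python integers.

```python
import math


def lg_star(n):
    """Count how many times lg must be applied to n before the result is <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


# 2**65536 -> 65536 -> 16 -> 4 -> 2 -> 1 takes five applications of lg.
for n in (1, 2, 16, 65536, 2 ** 65536):
    print(lg_star(n))
```

Even for an input with tens of thousands of bits the loop runs only five times, which is the sense in which $\lg^*$ grows very slowly.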
2.2.3 Factorial:
Loose bound: n! grows faster than $2^n$ but slower than $n^n$
precisely, $n! = \Theta(n^{n+1/2} e^{-n})$ (Stirling's approximation)
$\lg(n!) = \Theta(n \lg n)$
2.2.4 Fibonacci
grows exponentially
2.2.5 misc
$2^{2^n}$ grows faster than n!
(n+1)! grows faster than n!
n! grows faster than $n \cdot 2^n$, which grows faster than $2^n$
$2^{\lg n} = n$
2.3 Python: Functions
function definition: def functionName(paramList):
variables have local scope by default
control flow: for, while, if
range(a, b) function: produces the integers a, ..., b-1
>>> list(range(1,5))
[1, 2, 3, 4]
Algorithm from book (p.24)              Python code (file isort.py)
INSERTION-SORT(A)                       def InsertionSort(A):
1 for j ← 2 to length[A]                    for j in range(1, len(A)):
2   do key ← A[j]                               key = A[j]
3      i ← j − 1                                i = j - 1
4      while i > 0 and A[i] > key               while (i >= 0) and (A[i] > key):
5        do A[i+1] ← A[i]                           A[i+1] = A[i]
6           i ← i − 1                               i = i - 1
7      A[i+1] ← key                             A[i+1] = key
Don't forget the colon ":" for def, for, and while constructs
Book A[1 ... n] ⇒ Python A[0 ... n-1]
Book's 2 ... n ⇒ Python 1 ... n-1.
line 1: range(1, len(A)) ⇒ [1 ... len(A)-1]
% python
>>> from isort import * # read file isort.py
>>> x = [2,7,3,8,1] # create test case
>>> InsertionSort(x) # call routine
>>> x # look at result
[1, 2, 3, 7, 8]
Works for other data types, too!
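As a small illustration of that last remark, the same routine sorts any values that support the `>` comparison. The sketch repeats the function so it runs on its own (renamed `insertion_sort` here); the sample lists are made up for the example.

```python
def insertion_sort(a):
    """Sort list a in place; works for any elements comparable with >."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift larger elements one slot to the right to open a gap for key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key


words = ["pear", "apple", "fig"]
insertion_sort(words)          # strings compare alphabetically
print(words)

pairs = [(2, "b"), (1, "z"), (2, "a")]
insertion_sort(pairs)          # tuples compare element by element
print(pairs)
```

Nothing in the function mentions a type; Python only needs the comparison in the `while` condition to be defined for the elements.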
2.4 Data Structures Review
Concepts:
Stacks: Last-in First-out
Queue: First-in First-out
Trees: Root, children
2.4.1 Array implementation
Stack S: need a stack pointer (top)
Push(x): S[++top] ← x
Pop(): return S[top--]
Note: Actually stacks can grow downward also.
Note: To be safe, check stack underflow/overflow.
Queue Q: need 2 pointers (head, tail)
Enqueue(x): Q[tail] ← x
            tail ← (tail mod len(Q)) + 1
Dequeue(): temp ← Q[head]
           head ← (head mod len(Q)) + 1
           return temp
Note: To be safe, check queue underflow/overflow.
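The array-based stack and circular queue can be sketched as below. Several details are assumptions made for this illustration: the class names, the fixed capacity, 0-based indices (so wraparound is `(i + 1) % len` instead of the 1-based `(i mod len) + 1` above), and a `size` counter used to tell a full queue from an empty one.

```python
class ArrayStack:
    def __init__(self, capacity):
        self.data = [None] * capacity
        self.top = -1                      # index of the top element; -1 means empty

    def push(self, x):
        if self.top + 1 == len(self.data):
            raise OverflowError("stack overflow")
        self.top += 1
        self.data[self.top] = x

    def pop(self):
        if self.top < 0:
            raise IndexError("stack underflow")
        x = self.data[self.top]
        self.top -= 1
        return x


class ArrayQueue:
    def __init__(self, capacity):
        self.data = [None] * capacity
        self.head = 0                      # index of the next element to dequeue
        self.tail = 0                      # index of the next free slot
        self.size = 0

    def enqueue(self, x):
        if self.size == len(self.data):
            raise OverflowError("queue overflow")
        self.data[self.tail] = x
        self.tail = (self.tail + 1) % len(self.data)   # wrap around
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue underflow")
        x = self.data[self.head]
        self.head = (self.head + 1) % len(self.data)   # wrap around
        self.size -= 1
        return x


q = ArrayQueue(3)
for item in (1, 2, 3):
    q.enqueue(item)
first = q.dequeue()    # frees a slot at the front
q.enqueue(4)           # tail wraps around into the freed slot
```

All four operations take O(1) time; the overflow and underflow checks are the safety checks the notes recommend.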
2.4.2 Stack and Calls/Recursion
Calling routine ⇒ Push local, param, return address
Return ⇒ Pop and continue
Example: MergeSort (cf. MergeSort Fig. from last lecture)
L (Left), R (Right), M (Merge); each row shows the call stack from bottom to top
(1) [1, 6]
(2) [1, 6]L  [1, 3]
(3) [1, 6]L  [1, 3]L  [1, 2]
(4) [1, 6]L  [1, 3]L  [1, 2]L  [1, 1]
(5) [1, 6]L  [1, 3]L  [1, 2]R
(6) [1, 6]L  [1, 3]L  [1, 2]R  [2, 2]
(7) [1, 6]L  [1, 3]L  [1, 2]M
(8) [1, 6]L  [1, 3]R
then:
    [1, 6]L  [1, 3]R  [3, 3]
    [1, 6]L  [1, 3]M
    [1, 6]R
2.4.3 Linked List x
Singly linked (next) or Doubly linked (next, prev)
need a key
need a NIL value
How to implement linked lists?
Pre-allocate arrays prev[1 : n], next[1 : n], key[1 : n]; x is an index 0 ... n, where 0 is NIL.
Records (structures in C, classes in Java): x.prev, x.next, x.key, and NULL for NIL.
Rearrange by adjusting links, not moving data (key)
Recall: INSERTIONSORT with an array needs to copy objects to insert.
Linked list representation: once we know where to insert, adjust links in O(1) time.
Array version: best case O(1), worst case O(n), average case $O(\frac{n}{2}) = O(n)$.
[Figure: insert(3) into an array, shifting elements one slot at a time to open a gap, versus insert(3) into a linked list starting at head: (1) allocate a node for the data, (2) adjust links]
Search for the item
e.g., where to insert?
list is unsorted: worst/average case O(n) time to find it! (compare keys one at a time).
list is sorted:
Array: binary search in $O(\lg n)$ time
Singly linked list: O(n) avg/worst case, need to do linear search
Doubly linked list: does not help! still O(n); can't jump to the middle of the list, because the only entry points are the head/tail.
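A singly linked list makes the split between the two costs concrete: finding the position is a linear scan, while the insertion itself only adjusts two links. The names `Node`, `insert_sorted`, and `to_list` are illustrative, and Python's `None` plays the role of NIL.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def insert_sorted(head, key):
    """Insert key into a sorted singly linked list; return the (possibly new) head."""
    new = Node(key)                         # (1) allocate a node for the data
    if head is None or key <= head.key:
        new.next = head
        return new
    cur = head
    # Linear search, O(n): stop at the last node whose key is smaller than key.
    while cur.next is not None and cur.next.key < key:
        cur = cur.next
    new.next = cur.next                     # (2) adjust links, O(1)
    cur.next = new
    return head


def to_list(head):
    """Collect the keys into a Python list for easy printing."""
    keys = []
    while head is not None:
        keys.append(head.key)
        head = head.next
    return keys


head = None
for k in (6, 5, 4, 2, 1):                   # build the list 1 -> 2 -> 4 -> 5 -> 6
    head = insert_sorted(head, k)
head = insert_sorted(head, 3)
print(to_list(head))
```

No existing key is moved when 3 is inserted; only `new.next` and `cur.next` change.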
2.4.4 Binary search tree
pointer to root; each node has pointers to left-child, right-child
optional pointer to parent
left smaller, right bigger key value
$O(\lg n)$ search, if BALANCED
Can still be O(n) if not balanced!!
Red-black tree (Ch.13) is a way to keep BST balanced
a Heap (Ch.6) stores a balanced binary tree in an array with no pointers (it is heap-ordered, not a search tree)
[Figure: the keys 1, 2, 4, 6, 13, 18, 20 stored three ways: a doubly linked list (head, tail), a binary (search) tree (root), and an n-ary tree with left-child, right-sibling links (root)]
2.4.5 rooted trees w/ unbounded branches
Could have arbitrary number of children!
left-child: head of linked list to children (down one level)
right-sibling: link to next on same level
optional link back to parent
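The left-child, right-sibling representation needs only a fixed number of links per node, however many children it has. The sketch below uses made-up names (`TreeNode`, `add_child`, `children`) and prepends new children, so siblings appear in reverse order of insertion.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.parent = None          # optional link back to parent
        self.left_child = None      # head of the linked list of children
        self.right_sibling = None   # next node on the same level


def add_child(parent, child):
    """Attach child as the new first child of parent in O(1) time."""
    child.parent = parent
    child.right_sibling = parent.left_child
    parent.left_child = child


def children(node):
    """Walk the sibling chain and return the keys of node's children."""
    keys = []
    c = node.left_child
    while c is not None:
        keys.append(c.key)
        c = c.right_sibling
    return keys


root = TreeNode(18)
for k in (6, 20, 13):
    add_child(root, TreeNode(k))
print(children(root))
```

Each node stores three links regardless of its number of children, at the cost of a linear walk to reach the i-th child.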
Lecture 3
Recurrences; QuickSort
This lecture:
Chapter 4: Recurrences, Master method
Substitution method
Recursion tree
Master method: T(n) = aT(n/b) + f (n)
Quicksort (Ch. 7.1-7.2) (if we have time today)
Summation review (Appendix A)
3.1 Recurrence: by Substitution method
guess-and-verify
rename variables if necessary.
mathematical induction: assume true for $n_0$, show true for $n \ge n_0$.
issue: boundary conditions, floor/ceiling ⇒ use inequality to bound those cases.
avoid pitfall with constant terms.
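Example w/ floor
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1, \\ 2T(\lfloor n/2 \rfloor) + n & \text{if } n > 1. \end{cases}$$
Guess $T(n) = O(n \lg n)$.
sufficient to show $T(n) \le cn \lg n$ for some $c > 0$.
Assume $T(\lfloor n/2 \rfloor) \le c \lfloor n/2 \rfloor \lg(\lfloor n/2 \rfloor)$.
$$\begin{aligned}
T(n) &\le 2\left(c \lfloor n/2 \rfloor \lg(\lfloor n/2 \rfloor)\right) + n && \text{substitution} \\
&\le cn \lg(n/2) + n && \text{drop floor} \\
&= cn \lg n - cn \lg 2 + n && (\lg(a/b) = \lg a - \lg b) \\
&= cn \lg n - cn + n && (\lg 2 = 1) \\
&\le cn \lg n && \text{for } c \ge 1
\end{aligned}$$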
Avoid Pitfalls
What if we guessed T(n) = O(n)??
Try to show $T(n) \le cn$:
$T(n) \le 2(c \lfloor n/2 \rfloor) + n$   (subst. into $T(n) \le 2T(\lfloor n/2 \rfloor) + n$)
$\le cn + n$   (drop floor)
$= (c + 1)n$   (factor out const)
$\not\le cn$   Fails!
must be careful not to drop constants arbitrarily.
subtractive term in guessed solution
Example: $T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1$
Guess: O(n). First attempt: verify $T(n) \le cn$:
$T(n) \le c \lfloor n/2 \rfloor + c \lceil n/2 \rceil + 1$
$= cn + 1 \not\le cn$
Oops! But this does not mean $T(n) \ne O(n)$.
works if we guess $T(n) \le cn - b$ for some $b \ge 0$:
$T(n) \le (c \lfloor n/2 \rfloor - b) + (c \lceil n/2 \rceil - b) + 1$
$= cn - 2b + 1$
$\le cn - b$   works for $b \ge 1$.
Variable Renaming
introduce a new variable to replace a difficult expression, make
it easier to solve
Example: $T(n) = 2T(\lfloor \sqrt{n} \rfloor) + \lg n$
Rename: let $m = \lg n$: $T(n = 2^{\lg n} = 2^m) = 2T(2^{m/2}) + m$
Rename: let $S(m) = T(2^m)$, so $S(m) = 2S(m/2) + m$
This looks like $S(m) = O(m \lg m)$
Substitute back, $\lg n = m$: $T(2^m = 2^{\lg n} = n) = S(m) = O(m \lg m) = O((\lg n) \lg(\lg n))$.
3.2 Recursion tree, convert to summation
Example: $T(n) = 3T(\lfloor n/4 \rfloor) + \Theta(n^2)$.
Visualize as recursion tree: $T(n) = 3T(n/4) + cn^2$ for some $c > 0$
[Figure: recursion tree. The root costs $cn^2$ and has three children $T(n/4)$; each child costs $c(n/4)^2$ and has three children $T(n/16)$; the per-level totals are $cn^2$, $(3/16)cn^2$, ..., down to the $T(1)$ leaves.]
Convert to summation:
$T(n) = cn^2 + \frac{3}{16} cn^2 + (\frac{3}{16})^2 cn^2 + \cdots + (\frac{3}{16})^{\log_4 n - 1} cn^2 + \Theta(\text{width})$
# Levels = $1 + \log_4 n$, similar argument to MergeSort.
root is 1 level, divide $\log_4 n$ times.
width at level i is $3^i$, so at the bottom level, width is $3^{\log_4 n}$.
Apply the identity $a^{\log_b c} = c^{\log_b a}$, rewrite $3^{\log_4 n} = n^{\log_4 3}$.
$T(n) = \sum_{i=0}^{(\log_4 n) - 1} (\frac{3}{16})^i cn^2 + \Theta(n^{\log_4 3})$
$< \sum_{i=0}^{\infty} (\frac{3}{16})^i cn^2 + \Theta(n^{\log_4 3})$   (apply geometric series $\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$)
$= \frac{1}{1 - (3/16)} cn^2 + \Theta(n^{\log_4 3})$
$= \frac{16}{13} cn^2 + \Theta(n^{<1})$
$= O(n^2)$.
Can also show $T(n) = \Omega(n^2)$ and therefore $T(n) = \Theta(n^2)$.
Unbalanced Recursion Tree
Example: $T(n) = T(n/3) + T(2n/3) + O(n)$
Root level: $cn$
Next level: $c \frac{n}{3} + c \frac{2n}{3} = cn$
Tree height: How many times can we divide n into 1/3 and 2/3?
$\log_3 n$ times for the 1/3 division (most shallow branch), and
$\log_{3/2} n$ times for the 2/3 division (deepest branch).
# nodes at level i $\le 2^i$ (binary tree)
# nodes at bottom level $< 2^{\log_{3/2} n} = n^{\log_{3/2} 2}$ (not a full binary tree)
Note log grows slower than any polynomial, so this (over)estimate of the leaves is
$n^{\log_{3/2} 2} = n^{\log_{1.5} 2} = n^{>1} = \omega(n \lg n)$.
Total cost = cn × #levels + #leaves
The leaf estimate overcounts because the tree is not full, so take the per-level cost as the guess:
$O(cn \log_{3/2} n) = O(n \lg n)$.
Can use substitution method to verify upper bound:
$T(n) \le dn \lg n$ (see book for proof)
3.3 The Master Method
General form: $T(n) = aT(n/b) + f(n)$, $a \ge 1$, $b > 1$
intuition: divide-and-conquer
a-way branch (recursive calls) per level,
each subproblem size = n/b,
Combine cost is f(n).
#levels = $\log_b n$
#nodes(at level i) = $a^i$
#nodes(at leaf level) = $a^{\log_b n} = n^{\log_b a}$
Example: MERGESORT
a = 2, b = 2, $f(n) = \Theta(n)$; floor/ceiling don't matter
Three cases, compare f(n) to $n^{\log_b a}$ (with $\epsilon > 0$ a constant):
1. if $f(n) = O(n^{\log_b a - \epsilon})$, then $T(n) = \Theta(n^{\log_b a})$.
2. if $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
3. if $f(n) = \Omega(n^{\log_b a + \epsilon})$ and $a f(n/b) \le c f(n)$ for const $c < 1$, $n \ge n_0$, then $T(n) = \Theta(f(n))$.
Restated:
1. if f(n) is polynomially slower than #leaves = $n^{\log_b a}$, then
the leaves dominate the cost; it's $\Theta(a^{\log_b n}) = \Theta(n^{\log_b a})$
2. f(n) (conquer) and $n^{\log_b a}$ (divide) are equally fast
(with lg n levels), then it's $\Theta(n^{\log_b a} \lg n)$.
3. The conquer cost f(n) is dominant.
Important note:
Master method does not cover all recurrences!
f(n) and $n^{\log_b a}$ must differ polynomially
(i.e., by a factor of $n^{\epsilon}$); smaller differences don't apply.
Actually, a more general result for case 2:
if $\frac{f(n)}{n^{\log_b a}} = \Theta(\lg^k n)$ for constant $k \ge 0$
then $T(n) = \Theta(n^{\log_b a} \lg^{k+1} n)$
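Examples
$T(n) = 9T(n/3) + n$
f(n) = n, a = 9, b = 3, $n^{\log_b a} = n^{\log_3 9} = \Theta(n^2)$
case 1, because $f(n) = O(n^{\log_3 9 - \epsilon}) = O(n^{2-1})$,
within a polynomial factor
$T(n) = \Theta(n^{\log_b a}) = \Theta(n^2)$.
$T(n) = T(2n/3) + 1$
a = 1, b = 3/2, f(n) = 1, $n^{\log_b a} = n^{\log_{3/2} 1} = n^0 = 1$
case 2, because $f(n) = 1 = \Theta(n^{\log_b a})$.
$T(n) = \Theta(n^0 \lg n) = \Theta(\lg n)$
$T(n) = 3T(n/4) + n \lg n$
a = 3, b = 4, $f(n) = n \lg n$, $n^{\log_b a} = n^{\log_4 3}$
case 3: because $f(n) = n \lg n = \Omega(n^{\log_4 3 + \epsilon})$,
and the regularity condition holds: $3 (n/4) \lg(n/4) \le \frac{3}{4} n \lg n$, so $c = 3/4$
$T(n) = \Theta(f(n)) = \Theta(n \lg n)$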
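The case analysis in these examples can be mechanized for the common special case $f(n) = \Theta(n^d \lg^k n)$. The sketch below is only an illustration: the function name master_theorem and its return format are assumptions of this example, and it relies on the extended case 2 stated above. For this family of f, the regularity condition of case 3 holds automatically because $a / b^d < 1$ when $d > \log_b a$.

```python
import math


def master_theorem(a, b, d, k=0):
    """Solve T(n) = a*T(n/b) + Theta(n^d * lg^k n), with a >= 1, b > 1, k >= 0.

    Returns (case, p, q), meaning T(n) = Theta(n^p * lg^q n).
    """
    crit = math.log(a, b)  # critical exponent log_b(a), i.e. #leaves = n^crit
    if math.isclose(d, crit):
        # f(n) / n^crit = Theta(lg^k n): extended case 2 adds one log factor
        return ("case 2", crit, k + 1)
    if d < crit:
        # f(n) is polynomially smaller: the leaves dominate
        return ("case 1", crit, 0)
    # f(n) is polynomially larger: the top-level combine cost dominates
    return ("case 3", d, k)


print(master_theorem(9, 3, 1))      # 9T(n/3) + n
print(master_theorem(1, 1.5, 0))    # T(2n/3) + 1
print(master_theorem(3, 4, 1, 1))   # 3T(n/4) + n lg n
```

Recurrences whose f(n) is not of this form (or that differ from $n^{\log_b a}$ by less than a polynomial factor without being a log power) are outside what this helper can classify.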
3.4 QuickSort
Divide and Conquer:
pick a pivot to serve as the dividing point
partition array A around the pivot: one side with elements
smaller than pivot, one side larger
recursively sort two sublists
no need for merge!
[Figure: input A[p..r] = 2 8 7 1 3 5 6 4; pick pivot x = 4 (from A[r]). After PARTITION, want to get (≤ x | x | > x); in this case: 2, 1, 3 | 4 | 7, 5, 6, 8. QUICKSORT is then applied recursively to each side.]
QUICKSORT(A, p, r)
1 if p < r
2 then q ← PARTITION(A, p, r)
3 QUICKSORT(A, p, q−1)
4 QUICKSORT(A, q+1, r)
Algorithm from book (p.146):
PARTITION(A, p, r)
1 x ← A[r]
2 i ← p − 1
3 for j ← p to r − 1
4 do if A[j] ≤ x
5 then i ← i + 1
6 exchange A[i] ↔ A[j]
7 exchange A[i+1] ↔ A[r]
8 return i + 1
Python code:
def Partition(A, p, r):
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i = i + 1
            A[i], A[j] = A[j], A[i]
    A[i+1], A[r] = A[r], A[i+1]
    return i + 1
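Putting QUICKSORT and PARTITION together gives a complete in-place sort. The sketch below follows the pseudocode with 0-based Python indices; the default arguments on quicksort are a convenience added for this example.

```python
def partition(A, p, r):
    """Partition A[p..r] around the pivot A[r]; return the pivot's final index."""
    x = A[r]
    i = p - 1  # A[p..i] holds the elements <= x seen so far
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]  # put the pivot between the two sides
    return i + 1


def quicksort(A, p=0, r=None):
    """Sort A[p..r] in place; no merge step is needed."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)


data = [2, 8, 7, 1, 3, 5, 6, 4]
print(partition(data, 0, len(data) - 1), data)  # pivot 4 lands at index 3
quicksort(data)
print(data)
```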
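3.4.1 Worst case: uneven partitioning
Intuition:
$T(n) = T(n-1) + T(0) + \Theta(n)$
$= T(n-1) + \Theta(n)$
This is arithmetic series (triangle):
$\sum_{k=1}^{n} k = \frac{1}{2} n(n+1) = \Theta(n^2)$.
Happens when list already sorted!
Prove more strictly:
$T(n) = \max_{0 \le q \le n-1} (T(q) + T(n-q-1)) + \Theta(n)$
$\le \max_{0 \le q \le n-1} (cq^2 + c(n-q-1)^2) + \Theta(n)$
$= c \cdot \max_{0 \le q \le n-1} (q^2 + (n-q-1)^2) + \Theta(n)$
$\le c(n-1)^2 + \Theta(n) = c(n^2 - 2n + 1) + \Theta(n)$
$= cn^2 - c(2n-1) + \Theta(n)$
$\le cn^2$   (pick c large enough that $c(2n-1)$ dominates the $\Theta(n)$ term)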
3.4.2 Best case: balanced partitioning
$T(n) \le 2T(n/2) + \Theta(n)$
Can apply 2nd case of the Master theorem:
if $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
$T(n) = \Theta(n^{\log_2 2} \lg n) = \Theta(n \lg n)$, since $\log_2 2 = 1$
3.4.3 Average case
(next time)
Lecture 4
Heap sort, Priority Queues
This lecture:
Quick sort (continued from last time)
Heap property, Heap sort
Python implementation (for homework)
4.1 Heap data structure: Binary heap
nearly balanced binary tree
use array instead of pointer data structure
Definition:
parent(i) = $\lfloor i/2 \rfloor$
leftChild(i) = 2i
rightChild(i) = 2i + 1
heap property
by default: Max Heap: A[parent(i)] ≥ A[i]
Min Heap: A[parent(i)] ≤ A[i]
height of heap as a binary tree is Θ(lg n).
[Figure: a (max) heap with root x and children y, z. y, z are both ≤ x; y's L and R are both ≤ y; z's Left and R are both ≤ z. Array indices by level: 1; 2, 3; 4, 5, 6, 7.]
Examples: Are these (max)-heaps?
[7, 5, 6, 2, 3, 4]?
(a) 7 ≥ 5, 6; (b) 5 ≥ 2, 3; (c) 6 ≥ 4. Yes.
[7, 6, 5, 4, 3, 2]?
(a) 7 ≥ 6, 5; (b) 6 ≥ 4, 3; (c) 5 ≥ 2. Yes.
[2, 5, 7, 4, 3, 6]?
No. However, it is almost a heap if we don't consider
the root: (b) 5 ≥ 4, 3; (c) 7 ≥ 6.
This can be fixed with a call to HEAPIFY.
Heap API
HEAPIFY(A, i): (incremental fix) assume L and R subtrees
of i are heaps, make A[i . . . n] a heap.
BUILDHEAP(A): input unsorted array, rearrange to make it a heap
HEAPSORT(A): sorting alg., uses HEAPIFY and BUILDHEAP
4.2 Heapify:
Assumption: called when L(i), R(i) are heaps, but
A[i] itself isn't ≥ its children
after call: propagate the largest of children up to root
position, so that the whole thing is a heap again.
MAX-HEAPIFY(A, i, n) (see page 130, slightly different)
1 l ← LEFT(i)   ▷ index for left child
2 r ← RIGHT(i)   ▷ index for right child
3 if (l ≤ n) and (A[l] > A[i])
4 then largest ← l   ▷ left child exists and is bigger
5 else largest ← i   ▷ either no L or root is bigger
6 if (r ≤ n) and (A[r] > A[largest])
7 then largest ← r   ▷ right child exists and is bigger
▷ no else clause: either no R or root is bigger.
▷ already taken care of by line 5
8 if largest ≠ i
9 then exchange A[i] ↔ A[largest]
10 MAX-HEAPIFY(A, largest, n)
[Figure: MAX-HEAPIFY at the root of [2, 5, 7, 4, 3, 6]. largest = 7, so 2 and 7 are exchanged, giving [7, 5, 2, 4, 3, 6]; then largest = 6, so 2 and 6 are exchanged, giving [7, 5, 6, 4, 3, 2]. Done.]
Correctness: by induction.
Time complexity of Heapify
O(lgn) binary tree argument.
Or, O(h) where h = height of the heap as a binary tree.
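A minimal Python sketch of MAX-HEAPIFY follows. It assumes 0-based list indices, so the children of i are 2i+1 and 2i+2 rather than the 1-based 2i and 2i+1 used in the pseudocode; that index shift is a choice of this example, not of the notes.

```python
def max_heapify(A, i, n):
    """Sift A[i] down within A[0:n], assuming both subtrees of i are max-heaps."""
    left, right = 2 * i + 1, 2 * i + 2  # 0-based child indices
    largest = i
    if left < n and A[left] > A[largest]:
        largest = left   # left child exists and is bigger
    if right < n and A[right] > A[largest]:
        largest = right  # right child exists and is bigger still
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)  # at most one call per level: O(h)


A = [2, 5, 7, 4, 3, 6]  # "almost a heap": only the root is out of place
max_heapify(A, 0, len(A))
print(A)
```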
4.3 BuildHeap
input unsorted array, output a heap after the call
leaf = trivial heap, start with one level above the leaf
BUILD-MAX-HEAP(A) (page 133)
1 n ← length[A]
2 for i ← ⌊n/2⌋ downto 1
3 do MAX-HEAPIFY(A, i, n)
[Figure: Build (Max) Heap on [2, 3, 4, 5, 6, 7] (indices by level: 1; 2, 3; 4 to 7). The leaves are trivial heaps. HEAPIFY at index 3 gives [2, 3, 7, 5, 6, 4]; HEAPIFY at index 2 gives [2, 6, 7, 5, 3, 4]; HEAPIFY at index 1 gives [7, 6, 4, 5, 3, 2]. Done.]
Correctness: the assumption for Heapify always holds.
4.3.1 Complexity:
loose upper bound is $O(\frac{n}{2} \lg n) = O(n \lg n)$:
MAX-HEAPIFY is O(lg n); iterate n/2 times.
However! tighter bound is O(n).
HEAPIFY near leaf level is much cheaper than near root
level!
Height of n-element heap = $\lfloor \lg n \rfloor$. (binary tree)
# nodes at height h is at most $\lceil n/2^{h+1} \rceil$
Height 0 (leaf): $\lceil n/2^{0+1} \rceil = \lceil n/2 \rceil$ nodes.
One level higher: half as many nodes
HEAPIFY called on a node at height h is O(h) (see the time complexity of Heapify above)
$T(n) = \sum_{h=0}^{\lfloor \lg n \rfloor}$ (# of HEAPIFY calls at height h) × (time of HEAPIFY at height h)
$= \sum_{h=0}^{\lfloor \lg n \rfloor} \lceil \frac{n}{2^{h+1}} \rceil O(h)$   (factor out n, mult. by 2)
$= O(n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h})$.
Apply the integrating and differentiating series (Appendix A)
$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1-x)^2}$.
plug in $x = \frac{1}{2}$, we get
$\sum_{h=0}^{\infty} \frac{h}{2^h} = \frac{1/2}{(1 - 1/2)^2} = 2$.
$T(n) = O(n \cdot 2) = O(n)$
4.3.2 HeapSort
Use HEAPIFY to find the largest, stick it in the end A[n]
Use HEAPIFY on [1 . . . n−1] to find the second largest,
stick it in position A[n−1]
HEAPSORT(A) (page 136, slightly different)
1 BUILD-MAX-HEAP(A, n)   ▷ O(n) time
2 for i ← n downto 2   ▷ n−1 times
3 do exchange A[1] ↔ A[i]   ▷ O(1) time
4 MAX-HEAPIFY(A, 1, i−1)   ▷ O(lg(i−1)) time
Total time:
O(n) [BUILDHEAP] + O(n) [loop] × O(lg n) [HEAPIFY] = O(n lg n)
[Figure: HEAPSORT on input [2, 3, 4, 5, 6, 7]. Line 1: BUILDMAXHEAP makes a heap. Line 2: loop, i = 6. Line 3: swap A[1], A[i]; the last position is now sorted and the rest is almost a heap (root is wrong, but children are OK). Line 4: Heapify(A, 1, 5) makes it a heap again. Line 2: loop, i = 5; Line 3: swap A[1], A[i]; Line 4: Heapify(A, 1, 4). Line 2: loop, i = 4; ...]
Continue until the whole array is sorted.
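BUILD-MAX-HEAP and HEAPSORT can be sketched together in Python. As assumptions of this example, indices are 0-based and the sift-down step is written as a loop (named sift_down here) instead of the recursive MAX-HEAPIFY; the behavior is otherwise the same as the pseudocode.

```python
def sift_down(A, i, n):
    """Iterative MAX-HEAPIFY on A[0:n] starting at index i."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and A[child] > A[largest]:
                largest = child
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest


def build_max_heap(A):
    """Heapify every internal node, starting one level above the leaves."""
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(A, i, n)


def heapsort(A):
    """Repeatedly move the max to the end and restore the heap on the rest."""
    build_max_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]  # largest goes to its sorted position
        sift_down(A, 0, end)         # heap now occupies A[0:end]


A = [2, 3, 4, 5, 6, 7]
build_max_heap(A)
print(A)
heapsort(A)
print(A)
```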
4.4 Python Programming
See http://e3.uci.edu/01f/15545/heap-hint.html; or for PDF, try
http://e3.uci.edu/01f/15545/heap-hint.pdf
4.5 Priority Queues
want to get the highest priority element,
MAXHEAPINSERT(A, x)
EXTRACTMAX(A)
Implementation choices:
unsorted: O(n) to extract, O(1) to insert (anywhere)
sorted linked list: O(1) to extract, O(n) to insert
Heap: O(lgn) to insert , O(lgn) to extract
4.5.1 Heap Insert:
assume A is already a heap
increase size of A by one, put new element at the end, and
propagate up until its parent is larger (or it's the largest)
MAX-HEAP-INSERT(A, key) (diff. from book page 140)
1 heap-size[A] ← heap-size[A] + 1
2 i ← heap-size[A]
3 while i > 1 and A[Parent(i)] < key
4 A[i] ← A[Parent(i)]
5 i ← Parent(i)
6 A[i] ← key
[Figure: MaxHeapInsert(8) into the heap [7, 6, 4, 5, 3, 2]. Make room at the end; as key = 8 climbs, 4 and then 7 are moved down, giving [8, 6, 7, 5, 3, 2, 4].]
O(lg n) time (climb up from tail to at most root)
assume short circuit evaluation (i.e., check the index first; if i ≤ 1,
then A[parent(i)] is not even evaluated!)
Even though the algorithm says
heap-size[A] ← heap-size[A] + 1,
you probably can't do that in any programming language.
You need to actually make room for A[n+1], or
else your program can crash. In Python, call A.append
to increase physical size of A.
in Python, heap-size[A] can be derived from len(A),
rather than a global variable.
4.5.2 Extract max:
pull the max (already at the root)
move the last one to the front and heapify (since both
children are heaps)
in Python, use A.pop to accomplish heap-size[A] ←
heap-size[A] − 1.
HEAP-EXTRACTMAX(A)
1 if (heap-size[A] < 1)
2 then error "heap underflow"
3 max ← A[1]
4 A[1] ← A[heap-size[A]]
5 heap-size[A] ← heap-size[A] − 1
6 MAX-HEAPIFY(A, 1, heap-size[A])
7 return max
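The two priority-queue operations can be sketched on a plain Python list, using A.append and A.pop to change the heap size as suggested above. Indices are 0-based here (an assumption of this example), so the parent of i is (i − 1) // 2, and the function names are illustrative.

```python
def max_heap_insert(A, key):
    """Insert key into max-heap A in O(lg n)."""
    A.append(key)  # make room: heap size grows by one
    i = len(A) - 1
    # short-circuit 'and': the parent is only examined when i > 0
    while i > 0 and A[(i - 1) // 2] < key:
        A[i] = A[(i - 1) // 2]  # move the smaller parent down
        i = (i - 1) // 2
    A[i] = key


def heap_extract_max(A):
    """Remove and return the largest element of max-heap A in O(lg n)."""
    if not A:
        raise IndexError("heap underflow")
    top = A[0]
    last = A.pop()  # heap size shrinks by one
    if A:
        A[0] = last  # move the last element to the root, then sift it down
        i, n = 0, len(A)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and A[child] > A[largest]:
                    largest = child
            if largest == i:
                break
            A[i], A[largest] = A[largest], A[i]
            i = largest
    return top


A = [7, 6, 4, 5, 3, 2]
max_heap_insert(A, 8)
print(A)
print(heap_extract_max(A), A)
```

Python's standard library also ships a heapq module, but it implements a min-heap, so it is not a drop-in replacement for the max-heap routines in these notes.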
4.5.3 Heap Increase key
HEAP-INCREASE-KEY(A, i, key)
A is a heap, i is an index to an element.
Purpose: overwrite element A[i] with key (which is expected
to be larger than A[i], hence "increase key"),
while making minor changes to restore its heap property.
Strategy: propagate A[i] up towards the root, similar to
MAX-HEAP-INSERT.
HEAP-INCREASE-KEY(A, i, key) (page 140)
1 if key < A[i]
2 then error "new key is smaller than current key"
3 A[i] ← key
4 while i > 1 and A[Parent(i)] < A[i]
5 do exchange A[i] ↔ A[Parent(i)]
6 i ← Parent(i)
not a common routine in priority queues; used later in
min-spanning tree algorithms
very unusual to implement MAX-HEAP-INSERT as
shown on page 140.
implies a max heap. (HEAP-DECREASE-KEY would imply
a min heap.)
timing complexity similar to HEAPINSERT except no
reallocation of storage size
Lecture 5
Linear Time sorting, Medians/Order Stat
This lecture
problem complexity of comparison-based sorting (8.1)
noncomparison sorting: counting sort (8.2), radix sort
(8.3), bucket sort (8.4)
5.1 Comparison-based sorting: prob. complexity
Decision Tree argument
Use a tree to represent all possible execution traces of an
algorithm (Fig. 8.1)
[Figure: decision tree for sorting a[1], a[2], a[3]. Internal nodes are the pairwise comparisons 1:2, 2:3, 1:3; the six leaves are the orderings <1, 2, 3>, <1, 3, 2>, <3, 1, 2>, <2, 1, 3>, <2, 3, 1>, <3, 2, 1>.]
two-way branch on >, <, ≤, ≥, or any pairwise comparison
operator
Q: How many ways can the elements be ordered in a list of size
n?
n! ways (n-factorial, permutation)
first position: n choices; second position: n−1 choices, ...
leaf of tree: algorithm finishes
path length from root to leaf: number of steps (leaf
itself is just the result)
# leaves in decision tree:
# leaves ≥ # of possible outcomes.
# of leaves ≥ n!
less efficient algorithm would actually have more leaves
(redundant comparisons)
Runtime bounded by tree height: T(n) = Ω(h)
root-to-leaf path length = height = lower bound on worst-case
runtime
a decision tree must have at least n! leaves,
binary tree of height h has at most $2^h$ leaves (upper bound).
$n! \le 2^h$. Take log on both sides,
$\lg(n!) \le \lg(2^h) = h$.
from Ch. 3, page 55, $\lg(n!) = \Theta(n \lg n)$ using Stirling's
approximation
$h = \Omega(n \lg n)$.
[Theorem 8.1] Any comparison sort algorithm requires
Ω(n lg n) comparisons in the worst case.
[Corollary 8.2] HEAPSORT, MERGESORT, median partitioned
QUICKSORT are asymptotically optimal comparison sorts.
5.2 Non-comparison sorts
domain knowledge: e.g., exam scores, SAT scores, etc.
Clearly a lot more students than #of possible scores.
observation: sorting is based on a key of an object,
which might not be a unique identier
example of key: exam score; weight, height; income;
5.2.1 Counting sort
Idea:
1. count how many times each key occurs
2. convert the counts to offsets (to the end)
3. from the back, copy each key from input to output buffer
based on the offset; decrement
A[1..n]: original source array
B[1..n]: output array
C[0..k]: key count; overloaded to be index
Initialize C[0..k] ← 0
foreach A[i]
do C[A[i]]++
foreach i ← 1 to k
do C[i] += C[i−1]
▷ computes Offset[i] := Offset[i−1] + C[i]
▷ converts C from count to index to the end of its region
for i ← n downto 1 do
B[C[A[i]]] ← A[i]; C[A[i]]−−
[Figure: counting sort on A = [2, 5, 3, 0, 2, 3, 0, 3]. Count of #times each key 0..5 appears in input: C = [2, 0, 2, 3, 0, 1]. Converted to index to end of region: C = [2, 2, 4, 7, 7, 8]. Copying from A to B in reverse order fills B step by step until B = [0, 0, 2, 2, 3, 3, 3, 5].]
stable sort (relative positions preserved)
uses a 2nd buffer
complexity = O(n+k), effectively linear if k = O(n).
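The three steps can be sketched in Python as below. Keys are assumed to be integers in 0..k, and because Python lists are 0-based the offset is decremented before the copy rather than after; both are choices of this example.

```python
def counting_sort(A, k):
    """Stable sort of integer keys in the range 0..k, in O(n + k) time."""
    C = [0] * (k + 1)
    for key in A:              # 1. count how many times each key occurs
        C[key] += 1
    for i in range(1, k + 1):  # 2. convert counts to end-of-region offsets
        C[i] += C[i - 1]
    B = [None] * len(A)        # the second (output) buffer
    for key in reversed(A):    # 3. copy from the back to keep the sort stable
        C[key] -= 1
        B[C[key]] = key
    return B


print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))
```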
5.2.2 RadixSort
Idea: For keys that have "digits"
use stable sorting to sort by one digit at a time
start from least significant digit!
[Figure: radix sort, least significant digit first.
input:                    329 457 657 839 436 720 355
stable-sort by LSD:       720 355 436 457 657 329 839
stable-sort by 2nd LSD:   720 329 436 839 355 457 657
stable-sort by MSD:       329 355 436 457 657 720 839]
Important!! does not work if you start from MSD!!!
[Figure: the same input stable-sorted by MSD first gives 329 355 457 436 657 720 839; the later passes on the 2nd MSD and the LSD then reorder by the less significant digits. Oops! Not sorted.]
Need to do a lot of bookkeeping if really want to sort from
MSD.
also useful for sorting by multiple keys: by name, by
SAT score, etc.; start from the secondary before the
primary key
runtime complexity: Θ(d(n+k)), where
d = # of digits in key,
k = radix (# possible values per digit)
reason: make d passes. Each pass is Θ(n + k) (e.g.,
counting sort)
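A short Python sketch of LSD radix sort is given below. As an assumption of this example, each pass distributes keys into per-digit lists and concatenates them, which is a stable Θ(n + k) pass standing in for counting sort; the helper name stable_sort_by_digit is made up here.

```python
def stable_sort_by_digit(keys, d, radix=10):
    """One stable pass: order non-negative integers by digit d (0 = least significant)."""
    buckets = [[] for _ in range(radix)]
    for key in keys:
        buckets[(key // radix ** d) % radix].append(key)  # append keeps input order
    return [key for bucket in buckets for key in bucket]


def radix_sort(keys, digits, radix=10):
    """Sort by one digit at a time, least significant digit first."""
    for d in range(digits):
        keys = stable_sort_by_digit(keys, d, radix)
    return keys


data = [329, 457, 657, 839, 436, 720, 355]
print(stable_sort_by_digit(data, 0))  # after the first (LSD) pass
print(radix_sort(data, 3))
```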
5.2.3 Bucket sort
assume values are distributed within some range.
project the value into n buckets, (insertion-) sort each
bucket
[Figure: input 78 17 39 26 72 94 21 12 23 68, with bucket # 7 1 3 2 7 9 2 1 2 6. Bucket 0: empty; bucket 1: 12 17; bucket 2: 21 26 23; bucket 3: 39; ...; bucket 9: 94.]
Timing Complexity: expected to be linear (to be revisited)
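A bucket sort along these lines might look like the sketch below. The half-open value range [lo, hi) passed as parameters and the small insertion sort helper are assumptions of this example.

```python
def insertion_sort(items):
    """Sort a small list in place; cheap when buckets hold few elements."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key


def bucket_sort(values, lo, hi):
    """Sort numbers assumed to be spread over [lo, hi) using n buckets."""
    n = len(values)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for v in values:
        index = int((v - lo) * n / (hi - lo))   # project the value into a bucket
        buckets[min(index, n - 1)].append(v)    # guard against rounding at the top
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    return result


print(bucket_sort([78, 17, 39, 26, 72, 94, 21, 12, 23, 68], 0, 100))
```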
5.3 Median and Order statistics
5.3.1 Find min? max? both min and max?
n−1 comparisons for min (keep the smallest so far)
also n−1 comparisons for max
both min/max: 2n−2
but could do better: $3 \lfloor n/2 \rfloor$ comparisons
1. compare pairs ($\lfloor n/2 \rfloor$ comparisons)
2. search for min in the smaller set
3. search for max in the bigger set
[Figure: input 3 7 5 8 6 1 9 4 compared in pairs. Smaller of each pair: 3 5 1 4; bigger of each pair: 7 8 6 9. n/2 comparisons for the pairwise compare, n/2 − 1 for the min (1) of the smaller set, n/2 − 1 for the max (9) of the bigger set.]
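The pairing trick can be sketched as follows. The function name min_and_max and the returned comparison counter are additions of this example, included only to make the 3⌊n/2⌋ bound observable.

```python
def min_and_max(items):
    """Return (smallest, largest, comparisons) using about 3n/2 comparisons."""
    n = len(items)
    if n == 0:
        raise ValueError("empty input")
    comparisons = 0
    if n % 2:  # odd length: the first element starts as both min and max
        lo = hi = items[0]
        start = 1
    else:      # even length: one comparison orders the first pair
        comparisons += 1
        lo, hi = sorted(items[:2])
        start = 2
    for i in range(start, n - 1, 2):
        a, b = items[i], items[i + 1]
        comparisons += 3   # pair vs pair, smaller vs min, bigger vs max
        if a > b:
            a, b = b, a
        if a < lo:
            lo = a
        if b > hi:
            hi = b
    return lo, hi, comparisons


print(min_and_max([3, 7, 5, 8, 6, 1, 9, 4]))
```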
5.3.2 How to find the i-th smallest w/out sorting?
keep track of min & max so far
find the median
Use QuickSort's partitioning:
each pass: pivot is in the correct rank (absolute sorted
position)
if pivot is the position we want, return it
otherwise, continue partitioning in either one of the two
intervals.
RANDOMIZED-SELECT(A, p, r, i)
1 if p = r
2 then return A[p]
3 q ← RANDOMIZED-PARTITION(A, p, r)
4 k ← q − p + 1
5 if i = k   ▷ the pivot value is the answer
6 then return A[q]
7 else if i < k
8 then return RANDOMIZED-SELECT(A, p, q−1, i)
9 else return RANDOMIZED-SELECT(A, q+1, r, i−k)
Complexity of (randomized) Select:
Best case: O(n): first pass, very lucky, pivot is what we
want
Even partitioning: the passes cost n + n/2 + n/4 + ... +
2 + 1 < 2n = O(n)
Worst case: n + (n−1) + (n−2) + ... = $O(n^2)$
Average time: use Randomized
1/n probability of each position i as pivot. expected cost
= sum of (probability × cost) over each i as pivot
$T(n) \le \sum_{i=1}^{n-1} \frac{1}{n} T(\max(i, n-i)) + O(n)$
$\le \sum_{i=\lfloor n/2 \rfloor}^{n-1} \frac{2}{n} T(i) + O(n)$
Guess $T(n) \le an$,
Verify
$T(n) \le \frac{2}{n} \sum_{i=\lfloor n/2 \rfloor}^{n-1} (a \cdot i) + cn$
$\approx \frac{2}{n} \cdot \frac{n}{2} \cdot \frac{a(n-1) + a \lfloor n/2 \rfloor}{2} + cn \approx \frac{3}{4} an + cn$.
Choose a = 4c: $\frac{3}{4} an + cn = 3cn + cn = 4cn = an$, so T(n) = O(4cn) = O(n)
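RANDOMIZED-SELECT can be sketched in Python as below. As assumptions of this example, the two tail-recursive calls are replaced by a loop, the input is copied so the caller's list is untouched, and i is 1-based as in the pseudocode.

```python
import random


def randomized_select(values, i):
    """Return the i-th smallest element (1-based) in expected O(n) time."""
    A = list(values)  # work on a copy
    if not 1 <= i <= len(A):
        raise IndexError("rank out of range")
    p, r = 0, len(A) - 1
    while True:
        if p == r:
            return A[p]
        # RANDOMIZED-PARTITION: swap a random element into A[r], then partition
        s = random.randint(p, r)
        A[s], A[r] = A[r], A[s]
        x, q = A[r], p - 1
        for j in range(p, r):
            if A[j] <= x:
                q += 1
                A[q], A[j] = A[j], A[q]
        q += 1
        A[q], A[r] = A[r], A[q]
        k = q - p + 1          # rank of the pivot within A[p..r]
        if i == k:
            return A[q]        # the pivot value is the answer
        if i < k:
            r = q - 1          # continue in the left interval
        else:
            p, i = q + 1, i - k  # continue in the right interval


data = [3, 7, 5, 8, 6, 1, 9, 4]
print(randomized_select(data, 4))  # 4th smallest
```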
5.3.3 Can show that worst-case select is O(n)
(Theoretical interest only)
Divide n elements into sublists of LENGTH 5 each, for
$\lceil n/5 \rceil$ lists
Find the medians from these lists. Brute force (constant
time) on each list.
Recurse to find the median x of these $\lceil n/5 \rceil$ medians.
Partition around x. Let k = rank(x)
if (i = k) return x
if (i < k) then Select recursively to find the i-th smallest in
first part
else Select recursively to find the (i−k)-th smallest in second
part.
5.3.4 Justification:
[Figure: the n input elements arranged in columns of 5, each column sorted and the columns ordered by their medians (medians of 5 in the middle row). * is the median of medians; x is don't know, s = smaller, L = larger:
x x x L L L L
x x x L L L L
s s s * L L L
s s s s x x x
s s s s x x x]
Claim: the median of medians of a list of length n has a
rank between 3n/10 − 6 and 7n/10 + 6.
Proof: There are $\lceil \frac{1}{2} \lceil n/5 \rceil \rceil - 1$ columns to the left
and right of the M-of-M.
The number of elements < or > than the M-of-M is at least
3 × # of columns to the left and right, respectively, plus
two in the M-of-M's own column.
Its rank is $\ge 3(\lceil \frac{1}{2} \lceil n/5 \rceil \rceil - 2) \ge 3n/10 - 6$
$T(n) = T(\lceil n/5 \rceil)$   (median of the $\lceil n/5 \rceil$ column medians, each found by brute force)
$+ T(7n/10 + 6)$   (recurse)
$+ \Theta(n)$   (partitioning)
$\le T(\lceil n/5 \rceil) + T(\frac{7n}{10} + 6) + c_1 n$.
Guess: $T(n) \le an$ for some a
Verify
$T(\lceil n/5 \rceil) + T(\frac{7n}{10} + 6) + c_1 n \le a \lceil n/5 \rceil + a(\frac{7n}{10} + 6) + c_1 n$
$\le a(n/5 + 1) + a(7n/10 + 6) + c_1 n$
$= (9/10)an + 7a + c_1 n$
if pick $a = 20 c_1$ and $n \ge 140$, then $7a + c_1 n \le \frac{1}{20} an + \frac{1}{20} an = \frac{1}{10} an$.
This way, $T(n) \le \frac{9}{10} an + \frac{1}{10} an = an$
But $a = 20 c_1$ is a bad constant!
Lecture 6
Binary Search Trees and Red-Black Trees
This lecture
Binary search trees
Python programming issues
Red-black trees
6.1 Binary Search Trees
binary tree
unique root
element = tree node x
required to have key[x] (or else kind of useless)
a left child left[x], a right child right[x], which are also
tree nodes.
either or both may be NIL. not required to have a child
no children: a leaf node
binary SEARCH tree
idea: think QuickSort, with root = pivot
key[left[x]] ≤ key[x] ≤ key[right[x]] (whenever defined)
all keys in left subtree ≤ pivot,
all keys in right subtree ≥ pivot
example: for the list [2, 3, 5, 5, 7, 8]
[Figure: a binary search tree for this list: root 5; left child 3 with children 2 and 5; right child 7 with right child 8. A second tree shows another valid BST for the same list.]
6.1.1 API for BST
INORDER-TREE-WALK(x): print keys in BST in sorted
order
TREE-SEARCH(x, k): find tree node with key k.
TREE-MAX(x) and TREE-MIN(x), assume x is root
TREE-SUCCESSOR(x) and TREE-PREDECESSOR(x),
for any tree node (need not be root)
TREE-INSERT(T, z) and TREE-DELETE(T, z)
INORDER-TREE-WALK(x)
1 if x ≠ NIL
2 then INORDER-TREE-WALK(left[x])
3 print key[x]
4 INORDER-TREE-WALK(right[x])
Note: variants of tree visits (binary trees in general, not
necessarily BST)
inorder = recursive left, root, recursive right
preorder = root, recursive left, recursive right
postorder = recursive left, recursive right, root
Timing complexity of INORDER-TREE-WALK:
Θ(n) time
Assume root x, left-subtree[x] has k nodes
right-subtree[x] has n−k−1 nodes.
assume each call incurs overhead d, and T(0) = c
Proof by substitution method: substitute the guess T(n) = (c+d)n + c into
T(n) = T(k) + T(n−k−1) + d
= ((c+d)k + c) + ((c+d)(n−k−1) + c) + d
= (c+d)n + c.
TREE-SEARCH(x, k)
1 if x = NIL or k = key[x]
2 then return x
3 if k < key[x]
4 then return TREE-SEARCH(left[x], k)
5 else return TREE-SEARCH(right[x], k)
returns x if found (key[x] matches k), NIL if not found
look in left subtree if smaller than x, right subtree if
larger
timing complexity is O(h), height of tree
best case O(lgn) if tree relatively balanced,
worst case O(n) if tree is a chain
TREE-MINIMUM(x)
1 while left[x] ≠ NIL
2 do x ← left[x]
3 return x
TREE-MAXIMUM(x)
1 while right[x] ≠ NIL
2 do x ← right[x]
3 return x
Max: go all the way to the right until a node with no right
child.
(could be an intermediate node)
Min: all the way left until a node w/ no left child.
Timing complexity O(h)
Successor and Predecessor
successor: the next larger element than x in tree
idea: smallest of the larger set (right subtree)
TREE-MINIMUM(right[x]), then we're done!
[Figure: BST with root 5; left child 3 with children 2 and 4; right child 8 with left child 7. TREE-SUCCESSOR(5) = min of right child: call TREE-MIN(8).]
but wait: what if x has no right child?
need to go up the tree, find the youngest ancestor x'
that is the left child of its parent y, and return y.
[Figure: same tree. TREE-SUCCESSOR(4): x' (3) is 4's youngest ancestor who is a left child of its parent y (5), then return y; therefore, y = 5 is the successor of 4.]
Reason: w has no right child ⇒ w is the TREE-MAX of
another subtree T.
Following INORDER-TREE-WALK, we just visited all of T
(because w was the max of T and therefore last), and are returning
from recursion
Two possible places to be returning to:
2 INORDER-TREE-WALK(L[y])
3 print y
4 INORDER-TREE-WALK(R[y])
if we returned to line 4, we will return again (no more
successor at this level)
if we returned to line 2, the next to print will be y. This
means as soon as we find the node = left[y], then we return
y as the successor.
[Figure: example BST with keys 15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9.]
TREE-SUCCESSOR(x)
1 if right[x] ≠ NIL
2 then return TREE-MINIMUM(right[x])
3 y ← p[x]
4 while y ≠ NIL and x = right[y]
5 do x ← y
6 y ← p[y]
7 return y
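TREE-SUCCESSOR translates to Python almost line by line. In the sketch below, the Node constructor that links children to their parent is a convenience assumed for this example, and the small tree is the one from the successor figures (root 5; 3 with children 2 and 4; 8 with left child 7).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.p = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.p = self  # maintain parent pointers


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_successor(x):
    """Return the node with the next larger key, or None if x holds the max."""
    if x.right is not None:
        return tree_minimum(x.right)  # smallest of the larger set
    y = x.p
    while y is not None and x is y.right:
        x, y = y, y.p  # climb while we are a right child
    return y           # first ancestor reached from its left subtree


four = Node(4)
root = Node(5, Node(3, Node(2), four), Node(8, Node(7)))
print(tree_successor(root).key, tree_successor(four).key)
```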
Predecessor
max of left subtree (if exists)
mirror image of successor
Insertion:
find the location where it's supposed to be:
when we reach a leaf,
new is bigger ⇒ insert to the right of leaf
new is smaller ⇒ insert to the left of leaf
[Figure: insert(6) into the BST with root 5; left child 3 with children 2 and 4; right child 7 with right child 8. 6 > 5 and 6 < 7, so 6 belongs as the left child of 7.]
always add a new node as a leaf.
boundary condition is to add to an empty tree
convert an existing leaf to an intermediate node.
diff. order of insertion yields a different tree!
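Insertion as described above can be sketched in Python. The function tree_insert returning the (possibly new) root is a design choice of this example so that the empty-tree boundary case needs no separate Tree object.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


def tree_insert(root, key):
    """Insert key as a new leaf and return the root of the tree."""
    z = Node(key)
    y, x = None, root
    while x is not None:  # walk down to find where the key is supposed to be
        y = x
        x = x.left if key < x.key else x.right
    z.p = y
    if y is None:
        return z          # boundary condition: the tree was empty
    if key < y.key:
        y.left = z        # new is smaller: insert to the left
    else:
        y.right = z       # new is bigger (or equal): insert to the right
    return root


def inorder(x):
    """Keys in sorted order (INORDER-TREE-WALK, collected into a list)."""
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


root = None
for key in [5, 3, 7, 2, 4, 8, 6]:
    root = tree_insert(root, key)
print(inorder(root), root.right.left.key)
```

Feeding the same keys in sorted order instead produces a chain of right children, which illustrates why insertion order matters for the height.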
Delete:
find it first (the book calls it z)
if z is leaf: just remove it. (no problem)
if z has one child c: splice up:
make child c take z's position (like linked list)
[Figure: Delete(7) from the BST with root 5; left child 3 with children 2 and 5; right child 7 with right child 8. 7 has only 1 child: splice up, so 8 takes 7's position.]
if z has two children: find and cut successor y, paste
over z
successor y must be either a leaf or have one child; it
can't have two children.
[Figure: Delete(5), where 5 has 2 children, in a BST with keys 15, 5, 3, 13, 7, 16, 12, 10, 6. Step 1: cut 5's successor = 6 (by splicing; in case of leaf, just cut). Step 2: paste 6 over 5.]
Equivalently, could find predecessor x, cut x and paste
over z.
Rotation:
Transform a BST while still preserving BST property
[Figure: LEFT-ROTATE(x) turns (x with left subtree α and right child y, where y has subtrees β and γ) into (y with left child x and right subtree γ, where x has subtrees α and β). RIGHT-ROTATE(y) is the inverse.]
Right rotate:
x is left child of y, want to make y the right child of
x.
x gets to keep its left child as before.
x's old right subtree (whose values t all satisfy x <
t < y) now becomes y's left child (i.e. promoted up
to x's old position)
Θ(1) time! just fix up pointers.
Actually, can also think in terms of Rotate Up/Down:
Left Rotate x: x goes down to left child, y gets
rotated up to be new root
Right Rotate y: x was the left child, now gets promoted
one level up to root position; at the same
time y gets rotated down from root position.
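The pointer fix-ups for both rotations can be sketched as follows. The Node class with parent pointers and the convention of returning the new subtree root are assumptions of this example; a caller that keeps a separate root reference would need to update it when the returned node has no parent.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.p = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.p = self


def _replace_child(old, new):
    """Hook `new` into the place `old` occupied under old's parent."""
    new.p = old.p
    if old.p is not None:
        if old is old.p.left:
            old.p.left = new
        else:
            old.p.right = new


def left_rotate(x):
    """x goes down to the left; its right child y comes up. Returns y."""
    y = x.right
    x.right = y.left           # subtree beta changes parent
    if y.left is not None:
        y.left.p = x
    _replace_child(x, y)
    y.left, x.p = x, y
    return y


def right_rotate(y):
    """Mirror image: y goes down to the right; its left child x comes up."""
    x = y.left
    y.left = x.right           # subtree beta changes parent
    if x.right is not None:
        x.right.p = y
    _replace_child(y, x)
    x.right, y.p = y, x
    return x


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


x = Node(1, Node(0), Node(3, Node(2), Node(4)))
y = left_rotate(x)
print(y.key, inorder(y))   # sorted order is unchanged by the rotation
```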
6.2 Python Programming with Trees
object-oriented programming: Tree and Node classes
tuple encoding
6.2.1 Node data structure
class Node:
    # constructor
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None
To instantiate, call the constructor (name of the class) with
the params:
>>> x = Node(3) # makes a new node with key = 3
>>> x.key
3
Note that calling Node(3) actually calls the Node class's
__init__ (constructor) routine.
Use x.key, x.left instead of key[x], left[x], etc.
We can insert additional fields even after the constructor
is called!
example: x.color = 'red' # add a new color attribute
Example routine:
def TreeMaximum(x):
    while x.right is not None:
        x = x.right
    return x
6.2.2 Tuple-encoded Trees
Tuples vs Lists:
list: L = [1, 2, 'a', 'b', 'c']
tuple: T = (1, 2, 'a', 'b', 'c')
Similarities:
Both tuples and lists let you access elements by their indices:
e.g. L[3] and T[3] both yield the value of 'b'.
len works for both. Both len(L) and len(T) yield
5.
Differences:
syntax: tuples use ( ); lists use [ ]
tuples are immutable (read only)
not allowed to say T[3] = 'z'
OK to do lists L[3] = 'z'
tuples can be compared lexicographically:
(1, 2) < (1, 2, 3) < (1, 3)
Can have tuples within tuples:
(1, (2, 3, (4, 5)), 6)
Tuples as Tree Nodes
Pre-order encoding: (key, left, right)
Example: (5, (3, (2, None, None), (5, None, None)), (7,
None, (8, None, None)))
[Figure: the tree encoded above: root 5; left child 3 with children 2 and 5; right child 7 with right child 8.]
convenient way of specifying a tree test case
Also for printing out the trees
To convert between tree data structure and tuples:
def TupleToTree(t):
    if t is None:
        return None
    # make a new node
    n = Node(t[0])
    n.left = TupleToTree(t[1])
    n.right = TupleToTree(t[2])
    return n
def TreeToTuple(x):
    if x is None:
        return None
    return (x.key,
            TreeToTuple(x.left),
            TreeToTuple(x.right))
6.3 Red-Black Trees
6.3.1 Five Properties: (important!!)
1. Every node is either red or black.
2. The root is black.
3. (conceptual) Leaf is black: actually, there are real
nodes vs. conceptual nodes. All real nodes have two
children. A NIL is not a real node and is the only kind of
node that can have no children.
4. Red node has two black children (could be NIL)
5. All simple paths from a node to a descendant leaf contain
the same number of black nodes. (black height)
Definition: black height bh(x)
# of black nodes on a path to a leaf, not including x itself.
a height-h node x has black height bh(x) ≥ h/2.
This is by property 4: since you can't see two red nodes
in a row, at most every other node is red, so at least h/2 are black.
number of vertices n is bounded by the height h:
upper bound: full tree w/ red/blacks: $n \le 2^h - 1$
lower bound: black only: $n \ge 2^{bh(x)} - 1$.
but since bh(x) ≥ h/2, we have
$n \ge 2^{h/2} - 1$.
rewrite lower bound:
$n + 1 \ge 2^{h/2}$, take log on both sides
$\lg(n+1) \ge h/2$, move 2, swap sides
$h \le 2 \lg(n+1)$
6.3.2 How to maintain the Red-Black Tree properties?
Example: color scheme?
[Figure: three colorings (a), (b), (c) of the tree with root 7, children 5 and 9, and 12 as the right child of 9. (a) This is OK: 7 is root (forced black); 12 red preserves bh. (b) not OK. (c) Is this one ok?]
Insertion using BST routines
insert(8) (as new leaf) to tree (a) above:
left child of 9.
need to color it red like 12.
[Figure: tree (a) with 8 added as the left child of 9.]
insert(11) as left child of 12. What color should 11 be?
[Figure: the tree with 11 added below 12. What color? If 11 is black, bh = 4 through 11 while everywhere else bh = 3. If 11 is red: red nodes must have two black children!]
if red: 12 is red... but 12 must have two black
children. (property 4)
if black: black height thru 11 is longer than
other paths (violates property 5)
Solution: recolor the tree!
[Figure: 11 is inserted as red and the color scheme above it is changed.]
But! what if we Insert(10)? (left child of 11)
[Figure: the tree with 10 added as the left child of 11: no feasible coloring scheme!]
need structural fix
want to fix it in O(lg n) time.
6.3.3 Insertion
call BST Insert: z as a red leaf.
black height didn't change
if the new node z has a black parent, done.
only possible violation: z and its parent both red
To fix it: (1) recoloring, (2) rotating
3 cases for Insertion fixup
problem: Two reds on adjacent levels
approach: separate the two reds with a black by swapping
colors from above if possible. (push the red bubble up
towards the root)
given parent[z] = red ⇒ Grandma[z] must be black.
[Figure: insert(z) as red under a red parent; grandma[z] must be black! The case depends on the color of the aunt: red: case 1 ⇒ swap color levels; black: case 2 or 3 ⇒ rotate followed by color swap.]
Case 1: grandma has two red children
solve by color-level swapping
We didn't violate black height!
we can push the red level up the tree, by swapping level
colors.
[Figure: case 1. z is inserted as red; grandma[z]'s color is swapped with her children's, which pushes red up. The lower levels are now OK; if great-grandma[z] is red, then more fixing is needed.]
However, we've moved the problem somewhere else.
Grandma is now red: could have two consecutive reds.
climb up the tree to fix it.
Cases 2 and 3: mom is red, aunt is black
solve by Rotations followed by color swapping
Case 2: (Grandma, mom, myself) are related as (L-child,
L-child) chain or (R-child, R-child) chain
1. up-rotate mom to grandma's position
2. problem: mom and I are still both red! solution: swap
mom and (original) grandma's colors
[Figure: case 2 with grandma[z] = 12, mom = 11, z = 10 inserted as red, aunt[z] is black. Step 1: rotate z's mom up to grandma's position. Step 2: swap colors betw. mom & grandma (original). Tree is OK.]
Case 3: (Grandma, mom, myself) are either L-R or R-L
chain
1. up-rotate myself to mom's position.
2. This transforms it into an L-L or R-R chain, which is
case 2. Continue as case 2.
[Figure: case 3 with grandma[z] = 12, mom = 10, z = 11 (insert(11) as red), aunt[z] is black. Step 1: up-rotate z to its mom's position. Step 2: we just transformed case 3 into an L-L chain or R-R chain (which is case 2), so we resume as case 2.]
6.3.4 Deletion
Modified based on the BST delete routine
4 cases instead of 3 when fixup
13.4 as optional reading; don't worry about details.
6.3.5 Tree data structure
maintains the root
to act as a front-end to the nodes, and serves as a good
place to group together related routines as methods.
class Tree:
    # constructor
    def __init__(self):
        self.root = None
    def insert(self, key):
        pass  # your code for insert
    def delete(self, key):
        pass  # your code for delete
Object-oriented vs. Procedure-oriented programming:
Procedure (textbook): TREE-INSERT(T, x)
Object oriented method call: T.insert(x)
The method must be declared as
def insert(self, key)
self must be the first parameter! (corresponds to
T)
to access instance variables (e.g., root), must say
self.root.
Can mix both styles
Lecture 7
Dynamic Programming
This lecture: Dynamic programming
Optimization problems and Dynamic programming
Assembly line scheduling
matrix multiply
longest common subsequence
optimal binary search tree
7.1 Optimization problems
many possible solutions with different costs
want to maximize or minimize some cost function
unlike sorting: it's sorted or not sorted; partially sorted
doesn't quite count.
examples: matrix-chain multiply (same results, just
faster or slower)
knapsack problem (thief filling up a sack), compression
7.1.1 Dynamic programming
"programming" here means tabular method
Instead of re-computing the same subproblem, save results
in a table and look up (in constant time!)
significance: convert an otherwise exponential/factorial-time
problem to a polynomial-time one!
Problem characteristic: Recursively decomposable
Search space: a lot of repeated sub-configurations
optimal substructure in solution
7.2 Case study: Assembly line scheduling
Two assembly lines, 1 and 2.
Each line $i$ has $n$ stations $S_{i,j}$ for the $n$ stages of the assembly process.
Each station $S_{i,j}$ takes time $a_{i,j}$.
A chassis at stage $j$ must travel to stage $(j+1)$ next.
  Option to stay on the same assembly line or switch to the other assembly line.
  Time overhead of $t_{i,j}$ if we decide to switch away from line $i$ after station $S_{i,j}$.
[Figure: assembly lines 1 and 2, stages 1 to 6, annotated with entry, station, transfer, and exit times.]
Want to minimize the total time for assembly:
  Line 1 only: 2+7+9+3+4+8+4+3 = 40
  Line 2 only: 4+8+5+6+4+5+7+2 = 41
  Optimal: 2+7+(2)+5+(1)+3+(1)+4+5+(1)+4+3 = 38 (transfer times in parentheses)
How many possible paths? $2^n$ (two choices at each stage).
Optimal substructure
Global optimal contains optimal solutions to subproblems.
The fastest way through any station $S_{i,j}$ must consist of
  the shortest path from the beginning to $S_{i,j}$, and
  the shortest path from $S_{i,j}$ to the end.
That is, we cannot take a longer path to $S_{i,j}$ and make up for it in stages $(j+1), \ldots, n$.
Notation: $f_i[j]$ = fastest possible time from the start through station $S_{i,j}$ (but not beyond).
$e_i$, $x_i$ are the entry/exit costs on line $i$.
Goal is to find $f^*$, the global optimum: $f^* = \min(f_1[n] + x_1,\; f_2[n] + x_2)$.
Initially, at stage 1 (for line $l$ = 1 or 2),
  $f_l[1] = e_l + a_{l,1}$  (entry time + assembly time)
At any stage $j > 1$ on line $l$ (where $m$ denotes the other line),
  $f_l[j] = \min\{\, f_l[j-1],\; f_m[j-1] + t_{m,j-1} \,\} + a_{l,j}$
  The first term stays on the same line $l$, the second comes from the other line $m$ plus the transfer time, and $a_{l,j}$ is the assembly time at station $S_{l,j}$.
Can write this as a recursive program:
  F(i, j)
    if j = 1
      then return e[i] + a[i,1]
      else return min(F(i, j−1), F(i%2+1, j−1) + t[i%2+1, j−1]) + a[i,j]
But! There are several problems:
  Many repeated evaluations of F(i, j); could be $O(2^n)$ time.
    Fix: use a 2-D array f[i, j] to remember the running minimum.
  Does not track the path.
    Fix: use array $l_i[j]$ to remember which line gave us this minimum.
Iterative version shown in book on p. 329.
Run time is $\Theta(n)$.
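The iterative, table-based version of the recurrence above can be sketched in Python as follows. This is a minimal sketch, not the book's code: the function name fastest_way and the 0-based list layout are assumptions, and the transfer times t are assumed values chosen to be consistent with the sums quoted above (the figure itself did not survive).

```python
def fastest_way(a, t, e, x):
    """Return (f_star, lines) for two assembly lines.

    a[i][j]: assembly time at station j of line i (0-based)
    t[i][j]: transfer time away from line i after station j
    e[i], x[i]: entry and exit times of line i
    """
    n = len(a[0])
    f = [[0] * n for _ in range(2)]          # f[i][j]: fastest time through S(i, j)
    came_from = [[0] * n for _ in range(2)]  # line used at stage j-1
    for i in range(2):
        f[i][0] = e[i] + a[i][0]
    for j in range(1, n):
        for i in range(2):
            m = 1 - i                         # the other line
            stay = f[i][j - 1]
            switch = f[m][j - 1] + t[m][j - 1]
            if stay <= switch:
                f[i][j], came_from[i][j] = stay + a[i][j], i
            else:
                f[i][j], came_from[i][j] = switch + a[i][j], m
    last = 0 if f[0][n - 1] + x[0] <= f[1][n - 1] + x[1] else 1
    f_star = f[last][n - 1] + x[last]
    # Walk the came_from table backwards to recover the path.
    path = [last]
    for j in range(n - 1, 0, -1):
        path.append(came_from[path[-1]][j])
    path.reverse()
    return f_star, [line + 1 for line in path]  # report lines as 1 or 2

# Assumed example data (entry/station/exit times match the sums in the notes).
a = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]
t = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]
e = [2, 4]
x = [3, 2]
f_star, lines = fastest_way(a, t, e, x)
print(f_star, lines)  # 38 [1, 2, 1, 2, 2, 1]
```

Each table entry is computed once from two earlier entries, which is where the linear running time comes from.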
7.3 Matrix-Chain multiply
Basic matrix multiply
$A$ is $p \times q$, $B$ is $q \times r$.
[Figure: $A$ ($p \times q$) times $B$ ($q \times r$) gives $AB$ ($p \times r$).]
The product is a $p \times r$ matrix:
  $c_{i,j} = \sum_{y=1}^{q} a_{i,y}\, b_{y,j}$
Total number of scalar multiplications = $pqr$.
Multiplying multiple matrices
Matrix multiplication is associative: $(AB)C = A(BC)$.
With $C$ of size $r \times s$, both yield a $p \times s$ matrix.
The total # of multiplications can be different!
$(AB)C$ is
  $pqr$ to multiply $AB$ first,
  + $prs$ to multiply $(AB)$ ($p \times r$) with $C$ ($r \times s$)
  = $pqr + prs$ total # of multiplications.
On the other hand, $A(BC)$ is $pqs + qrs$.
Example: if $p = 10$, $q = 100$, $r = 5$, $s = 50$, then
  $pqr + prs = 5000 + 2500 = 7500$
  $pqs + qrs = 50000 + 25000 = 75000$, ten times as many!
Generalize to a matrix chain: $A_1, A_2, A_3, \ldots, A_n$
But there are many ways to parenthesize!
  P(1) = 1: (A), nothing to multiply
  P(2) = 1: (AB)
  P(3) = 2: A(BC), (AB)C
  P(4) = 5: A(B(CD)), A((BC)D), (AB)(CD), (A(BC))D, ((AB)C)D
  P(5) = 14: ... exponential growth
  $P(n) = 1$ if $n = 1$; $P(n) = \sum_{k=1}^{n-1} P(k)\,P(n-k)$ for $n \ge 2$.
  $P(n) = \Omega(4^n / n^{3/2})$: at least exponential!
7.3.1 Optimal parenthesization to minimize # scalar mults
[Figure: the chain $A_{1\ldots5}$ parenthesized as $((A_1 A_2) A_3)(A_4 A_5)$, i.e., $A_{1\ldots3}$ (containing $A_{1\ldots2}$) times $A_{4\ldots5}$; in general the top-level split is $1..k$ and $k+1..n$.]
Notation:
  Let $A_{i\ldots j}$ denote the matrix product $A_i A_{i+1} \cdots A_j$.
  Matrix $A_i$ has dimension $p_{i-1} \times p_i$.
Optimal substructure
If the optimal parenthesization for $A_{1\ldots j}$ at the top level is $(L)(R) = (A_{1\ldots k})(A_{k+1\ldots j})$, then
  $L$ must be optimal for $A_{1\ldots k}$, and
  $R$ must be optimal for $A_{k+1\ldots j}$.
Proof by contradiction.
Let $M(i, j)$ = minimum cost of multiplying the $i$th through the $j$th matrix:
  $M(i,j) = 0$ if $i = j$
  $M(i,j) = \min_{i \le k < j} \{ M(i,k) + M(k+1,j) + p_{i-1}\, p_k\, p_j \}$ if $i < j$
As a recursive algorithm (very inefficient!):
  M(i, j)
    if i = j
      then return 0
      else return the minimum, over i ≤ k < j, of M(i, k) + M(k+1, j) + p[i−1]·p[k]·p[j]
Observation
Don't enumerate the whole space!
Work bottom-up: no need to take the min so many times!
Instead of recomputing $M(i, k)$, remember it in array $m[i, k]$.
Bookkeeping to track the optimal partitioning point. See Fig. 7.1.
$O(n^3)$ time, $\Theta(n^2)$ space (for the $m$ and $s$ arrays).
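As a rough illustration of the bottom-up idea, the following Python sketch fills the $m$ and $s$ tables and then reads the parenthesization back out of $s$. The helper parenthesize and the 1-based table layout (row and column 0 unused) are assumptions made for this sketch; the dimension list reuses the $p = 10$, $q = 100$, $r = 5$, $s = 50$ example from above.

```python
import math

def matrix_chain_order(p):
    """p[i-1] x p[i] is the dimension of matrix A_i, for i = 1..n."""
    n = len(p) - 1
    # 1-based tables; row/column 0 are unused padding.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # chain length
        for i in range(1, n - length + 2):    # starting index
            j = i + length - 1                # ending index
            m[i][j] = math.inf
            for k in range(i, j):             # split point between i and j
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k   # remember best cost and split
    return m, s

def parenthesize(s, i, j):
    """Rebuild the optimal parenthesization of A_i..A_j from the split table."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)}{parenthesize(s, k + 1, j)})"

m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], parenthesize(s, 1, 3))  # 7500 ((A1A2)A3)
```

The three nested loops over length, start, and split point give the cubic time bound, and the two tables give the quadratic space bound.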
7.4 Longest common subsequence
Example sequences: $X = \langle A, B, C, B, D, A, B \rangle$, $Y = \langle B, D, C, A, B, A \rangle$.
A subsequence of $X$ is $Z = \langle B, C, D, B \rangle$.
Longest common subsequences (LCS) have length 4: $\langle B, C, B, A \rangle$, $\langle B, D, A, B \rangle$.
This is a maximization, also over addition, but each step adds cost 1 (length increment).
Brute force:
  Enumerate all subsequences of $X$ (length $m$) and check whether each is a subsequence of $Y$ (length $n$).
  # of subsequences of $X$ = $2^m$ (binary decision at each point whether to include each letter).
  Worst-case time is $\Theta(n\, 2^m)$, because each one is checked against $Y$, whose length is $n$.
Better way:
Notation: $X_k$ = length-$k$ prefix of string $X$; $x_i$ is the $i$th character of string $X$.
Let $Z = \langle z_1, \ldots, z_k \rangle$ be an LCS of $X = \langle x_1, \ldots, x_m \rangle$ and $Y = \langle y_1, \ldots, y_n \rangle$.
  If $x_m = y_n$, then $z_k = x_m = y_n$, and $Z_{k-1}$ is an LCS of $X_{m-1}, Y_{n-1}$.
  If $x_m \ne y_n$, then
    if $z_k \ne x_m$, then $Z$ is an LCS of $X_{m-1}, Y$;
    if $z_k \ne y_n$, then $Z$ is an LCS of $X, Y_{n-1}$.
$c[i, j]$ = LCS length of $X_i, Y_j$:
  $c[i,j] = 0$ if $i = 0$ or $j = 0$
  $c[i,j] = c[i-1, j-1] + 1$ if $x_i = y_j$ (match)
  $c[i,j] = \max(c[i, j-1],\; c[i-1, j])$ otherwise (no match, advance either)
Algorithm
  c[0:m, 0] ← 0; c[0, 1:n] ← 0
  for i ← 1 to m
    do for j ← 1 to n
      do if x[i] = y[j]
        then c[i, j] ← c[i−1, j−1] + 1
             b[i, j] ← match (↖)
        else if c[i−1, j] ≥ c[i, j−1]
          then c[i, j] ← c[i−1, j]      (copy the longer length)
               b[i, j] ← dec i (↑)
          else (here c[i, j−1] > c[i−1, j])
               c[i, j] ← c[i, j−1]
               b[i, j] ← dec j (←)
The $c$ table for $X = \langle A,B,C,B,D,A,B \rangle$ ($m = 7$) and $Y = \langle B,D,C,A,B,A \rangle$ ($n = 6$), filled one row at a time:
  i  x_i | j=0   1   2   3   4   5   6
         |       B   D   C   A   B   A
  0      |   0   0   0   0   0   0   0
  1   A  |   0   0   0   0   1   1   1
  2   B  |   0   1   1   1   1   2   2
  3   C  |   0   1   1   2   2   2   2
  4   B  |   0   1   1   2   2   3   3
  5   D  |   0   1   2   2   2   3   3
  6   A  |   0   1   2   2   3   3   4
  7   B  |   0   1   2   2   3   4   4
Time $\Theta(mn)$, space $\Theta(mn)$.
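The table-filling procedure can be sketched in Python as below. This is an illustrative sketch only: instead of storing the $b$ array of arrows, it re-derives each arrow from the $c$ table while walking back from $c[m, n]$, which is a substitution for the bookkeeping in the pseudocode above.

```python
def lcs(x, y):
    """Return (length, one LCS) of strings x and y using the c[i][j] table."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:            # match: extend the diagonal
                c[i][j] = c[i - 1][j - 1] + 1
            elif c[i - 1][j] >= c[i][j - 1]:    # copy the longer length
                c[i][j] = c[i - 1][j]
            else:
                c[i][j] = c[i][j - 1]
    # Walk back from c[m][n], applying the same tests to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))  # (4, 'BCBA')
```

A different tie-breaking rule in the no-match case would return one of the other length-4 answers, such as BDAB.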
7.5 Optimal Binary Search Trees
Input: $n$ keys $K = \langle k_1, k_2, \ldots, k_n \rangle$ and
$n+1$ dummy keys $D = \langle d_0, d_1, \ldots, d_n \rangle$, with
  $d_0 < k_1 < d_1 < k_2 < d_2 < \cdots < k_n < d_n$.
Key $k_i$ has probability $p_i$, dummy key $d_i$ has probability $q_i$, and
  $\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$.
Want: a binary search tree that yields the fastest search (fewer steps) for frequently used words.
Keys $k_i$ should be internal nodes, and dummy keys $d_i$ should be leaves.
MATRIX-CHAIN-ORDER(p)
  n ← length[p] − 1
  for i ← 1 to n
    do m[i, i] ← 0
  for l ← 2 to n                      # l = length of interval considered
    do for i ← 1 to (n − l) + 1       # starting index, from 1 up to n − length for each length
      do j ← i + l − 1                # ending index, always length away from the starting index
         m[i, j] ← ∞
         for k ← i to j − 1           # different partitions between i and j
           do q ← m[i, k] + m[k+1, j] + p[i−1]·p[k]·p[j]
              if q < m[i, j]
                then m[i, j] ← q
                     s[i, j] ← k      # remember best k between i, j
  return m, s
[Illustration: chain $A_{i \ldots j}$ of length l; chain length l from 2 to n; starting index i from 1 to n − l + 1 (implies ending index j = i + l − 1); k from i to j − 1.]
Figure 7.1: Matrix-Chain-Order Algorithm and graphical illustration.
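Optimize for the common case. A balanced tree might not be good!
Example tree (not optimal):
[Figure: BST with root $k_2$; $k_1$ and $k_4$ are its children, $k_3$ and $k_5$ are the children of $k_4$, and the dummy keys $d_0, \ldots, d_5$ are the leaves.]
  i      0     1     2     3     4     5
  p_i          0.15  0.10  0.05  0.10  0.20
  q_i    0.05  0.10  0.05  0.05  0.05  0.10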
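Expected search cost of tree $T$: $\sum_{i=1}^{n} (\mathrm{depth}_T(k_i) + 1)\, p_i + \sum_{i=0}^{n} (\mathrm{depth}_T(d_i) + 1)\, q_i$.
Optimal substructure: if the root is $k_r$, with $L = (i \ldots r-1)$ and $R = (r+1 \ldots j)$, then $L$ and $R$ must be optimal subtrees.
Expected cost $e[i, j]$:
  $e[i,j] = q_{i-1}$ (a dummy leaf) if $j = i - 1$
  $e[i,j] = \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i,j) \}$ if $i \le j$
  Here $e[i, r-1]$ is the left subtree, $e[r+1, j]$ is the right subtree, and $w(i,j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l$ accounts for every node in the subtree moving one level deeper.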
[Figure: the example tree built up from its base cases.]
Base cases: $e[1,0] = q_0 = 0.05$ and $e[2,1] = q_1 = 0.10$.
Single key $k_1$: $e[1,1] = w[1,1] + e[1,0] + e[2,1]$, where $w[1,1] = q_0 + p_1 + q_1$; the $w$ term adds one more level of depth for all nodes in the left and right subtrees.
Use arrays to remember $e[i, j]$ and $w[i, j]$ instead of recomputing.
Use array root[i, j] to remember root positions.
OPTIMAL-BST(p, q, n) (page 361)
  for i ← 1 to n + 1
    do e[i, i−1] ← q[i−1]
       w[i, i−1] ← q[i−1]
  for l ← 1 to n
    do for i ← 1 to n − l + 1
      do j ← i + l − 1
         e[i, j] ← ∞
         w[i, j] ← w[i, j−1] + p[j] + q[j]
         for r ← i to j
           do t ← e[i, r−1] + e[r+1, j] + w[i, j]
              if t < e[i, j]
                then e[i, j] ← t
                     root[i, j] ← r
  return e, root
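A direct Python transcription of OPTIMAL-BST might look like the sketch below. The padding convention (p[0] unused so that indices match the pseudocode) is an assumption of this sketch, and the probabilities are the ones from the example tree above.

```python
import math

def optimal_bst(p, q):
    """p[1..n]: key probabilities (p[0] is unused padding); q[0..n]: dummy-key probabilities."""
    n = len(p) - 1
    # Rows 1..n+1 and columns 0..n are used, as in the pseudocode.
    e = [[0.0] * (n + 1) for _ in range(n + 2)]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]                 # empty subtree: just a dummy leaf
        w[i][i - 1] = q[i - 1]
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):          # try every key as the root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j], root[i][j] = t, r
    return e, root

p = [0.0, 0.15, 0.10, 0.05, 0.10, 0.20]
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
e, root = optimal_bst(p, q)
print(round(e[1][5], 2), root[1][5])  # 2.75 2
```

Because the costs are floating-point sums, comparisons against expected values should allow a small tolerance.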
Lecture 8
Greedy Algorithm
This lecture
Greedy algorithms: make the choice that looks best at the moment.
Activity selection problem
Huffman codes (with Python programming hints)
8.1 Activity Selection problem
Set $S$ of $n$ activities.
$s_i$ = start time of activity $i$, $f_i$ = finish time of $i$.
Assume $0 \le s_i < f_i < \infty$ (finite, positive/nonzero execution delay).
Activities $i, j$ are compatible if $f_i \le f_j$ implies $f_i \le s_j$ (i.e., no overlap).
Goal: find a maximum-size subset $A$ of mutually compatible activities.
  Maximize $|A|$ = # of activities,
  NOT maximize utilization!!
  There may be several optimal solutions, but it's sufficient to find just one.
Greedy solution: pick the next compatible task with the earliest finish time.
Example
[Figure: list of tasks sorted by finish time $f_i$, drawn on a time axis from 0 to 14, with the optimal schedules marked.]
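Notation
$S = \{a_1, \ldots, a_n\}$: set of activities.
$a_0$ = start, $a_{n+1}$ = end (fake) activities: $f_0 = 0$, $s_{n+1} = \infty$.
$S_{ij} = \{a_k \in S : f_i \le s_k < f_k \le s_j\}$
  (i.e., the set of activities compatible with both $a_i$ and $a_j$: they start after $a_i$ finishes and finish before $a_j$ starts).
$S = S_{0,n+1}$
Assumption: activities are sorted by finish time:
  $f_0 \le f_1 \le f_2 \le \cdots \le f_n \le f_{n+1}$
Claim: if $i \ge j$ then $S_{ij} = \emptyset$.
  Impossible to start later and finish earlier.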
Optimal substructure
Let $A$ = an optimal set (activity selection) for $S$, and $A_{i,j}$ = an optimal set for $S_{i,j}$.
If $a_k \in A_{i,j}$, then $A_{i,k}$ (the part of $A_{i,j}$ before $a_k$) must be an optimal solution to $S_{i,k}$.
  Proof: assume there exists $A'_{i,k}$ with more activities than our $A_{i,k}$ (that is, $|A'_{i,k}| > |A_{i,k}|$).
  Then we can construct a larger $A'_{i,j}$ by using $A'_{i,k}$ as the prefix, contradicting the original claim that $A_{i,j}$ is optimal.
Formulation:
Want to maximize $c[0, n+1]$, where $c[i, j]$ = max # of compatible activities in $S_{i,j}$.
  $c[i,j] = 0$ if $S_{i,j} = \emptyset$
  $c[i,j] = \max_{i<k<j,\; a_k \in S_{i,j}} \{ c[i,k] + c[k,j] + 1 \}$ if $S_{i,j} \ne \emptyset$
(Note: $S_{i,k}$ and $S_{k,j}$ are disjoint and do not include $a_k$.)
This looks like a dynamic programming formulation, but we can do better with greedy.
If $a_m \in S_{i,j}$ has the earliest finish time, then
1. $S_{i,m} = \emptyset$,
   because if it were not empty, there would exist $a_k$ that starts after $a_i$ finishes and finishes before $a_m$ starts;
   but if $a_k$ finishes before $a_m$, then $a_k$ should have been picked instead. Contradiction.
2. $a_m$ is a member of some maximum-size $A_{i,j}$:
   if not, let $a_k$ be the first activity of $A_{i,j}$; we can always construct $A'_{i,j} = (A_{i,j} - \{a_k\}) \cup \{a_m\}$, and $|A_{i,j}| = |A'_{i,j}|$.
[Figure: $S_{i,m}$ should be empty; if a maximum-size $A_{ij}$ starts with $a_k \ne a_m$, it should always be possible to create another $A'_{ij}$ by replacing $a_k$ with $a_m$.]
What does this mean?
Earliest-finish-time-first is always as good as any optimal choice.
Optimal substructures for the later half don't depend on the earlier optimal substructure.
Greedy Algorithm (iterative version, p. 378)
GREEDY-ACTIVITY-SELECTOR(s, f)
  # assume f already sorted!
  n ← length[s]
  A ← {a_1}
  i ← 1
  for m ← 2 to n
    do if s[m] ≥ f[i]        # next compatible with earliest finish
      then A ← A ∪ {a_m}
           i ← m
  return A
Another way to view this:
  Intuition: leave the largest remaining time.
What about least duration?
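The greedy selector is short enough to sketch in a few lines of Python. The version below sorts by finish time itself rather than assuming sorted input, and the activity list is illustrative data made up for this sketch, not taken from the notes.

```python
def greedy_activity_selector(activities):
    """activities: list of (start, finish) pairs; returns a maximum-size compatible subset."""
    chosen = []
    last_finish = float("-inf")
    # Consider activities in order of nondecreasing finish time.
    for start, finish in sorted(activities, key=lambda act: act[1]):
        if start >= last_finish:      # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen

# Hypothetical example data.
example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
           (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(greedy_activity_selector(example))
# [(1, 4), (5, 7), (8, 11), (12, 16)]
```

After the sort, the selection is a single linear pass, so the total cost is dominated by the sort.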
8.2 Greedy choice property
The global optimum is made up of the greedy choice plus an optimal solution to the remaining subproblem.
  Prove that the greedy choice is a member of some global optimum.
Optimal substructure:
  Show that removing the greedy choice leaves an optimal solution to the subproblem.
Greedy vs. dynamic programming: knapsack problem
Thief with a knapsack.
  Weight limited rather than volume limited.
Goal: maximize the value of stolen goods under the weight limit.
Variations:
  0-1 knapsack problem: whole items only (watches, nuggets).
  Fractional knapsack problem: can take parts of items (sugar, salt).
Important distinction:
Greedy works for the fractional knapsack problem:
  always take the things with maximum value per unit weight.
Greedy does not work for 0-1, by counterexample:
  Limit: 50 lb; items $60/10, $100/20, $120/30.
  Ratios: $6, $5, $4 per unit weight.
  Greedy solution: take item 1 and item 2, but item 3 won't fit: get $160/30.
  Optimal is to take item 2 and item 3 and leave item 1: $220/50.
[Figure: items worth $60 (weight 10), $100 (weight 20), and $120 (weight 30), with $:wt ratios 6, 5, 4, and a knapsack of capacity 50. Greedy is not optimal for 0-1 knapsack: adding the highest $:wt first until the capacity would be exceeded leaves the $120 item behind, for a result of $160 at weight 30. Optimal for 0-1 is, in this case, the best fit: $100 + $120 = $220 at weight 50. Greedy would work for fractional knapsack: also take a fraction of the $120 item, for a result of $240 at weight 50.]
How to solve 0-1 knapsack?
Dynamic programming, in O(nW) time for n items and weight limit W.
  Idea: iterate over total weights up to the limit.
  Need to assume integer weights (multiples of a unit weight).
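To make the contrast concrete, here is a small Python sketch of both variants on the three-item example. The function names and the one-dimensional table in knapsack_01 are choices made for this sketch; the notes only state the idea of iterating over total weights.

```python
def knapsack_01(items, limit):
    """items: list of (value, weight) with integer weights. O(n * limit) dynamic programming."""
    best = [0] * (limit + 1)          # best[w]: max value achievable with total weight <= w
    for value, weight in items:
        # Go downwards so that each item is used at most once.
        for w in range(limit, weight - 1, -1):
            best[w] = max(best[w], best[w - weight] + value)
    return best[limit]

def knapsack_fractional(items, limit):
    """Greedy: take items in order of decreasing value per unit weight."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        take = min(weight, limit)     # take as much of this item as still fits
        total += value * take / weight
        limit -= take
        if limit == 0:
            break
    return total

items = [(60, 10), (100, 20), (120, 30)]
print(knapsack_01(items, 50))          # 220
print(knapsack_fractional(items, 50))  # 240.0
```

The dynamic program is polynomial in the numeric value of the limit, not in the number of bits needed to write it down, which is why integer weights are required.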
8.3 Huffman codes
Fixed-length code:
  same # of bits for all characters.
Variable-length code:
  more frequent characters use fewer bits (e.g., vowels like a e i o u vs. q z).
Goal: minimize $\sum_{c \in C} \mathrm{frequency}(c) \cdot \mathrm{bitlength}(c)$.
Non-lossy! Preserves the same information.
Comparison example: 6 characters
                           a     b     c     d     e     f
  frequency (in thousands) 45    13    12    16    9     5
  fixed-length codeword    000   001   010   011   100   101
  var-length codeword      0     101   100   111   1101  1100
Why is variable length better?
  Fixed (3 bits per char): 100,000 chars → 300,000 bits.
  Variable length: $(45 \cdot 1 + (13+12+16) \cdot 3 + (9+5) \cdot 4) \cdot 1000 = 224{,}000$ bits (25% saving).
Problem: how do we know where each codeword begins?
Prefix codes: no codeword is a prefix of another codeword.
  E.g., only a starts with 0, so 00 means (a a).
  After a 1 we need to look at more bits:
    10 is a prefix of either b (101) or c (100);
    11 is a prefix of d, e, f.
Example: 001011101 uniquely parses as 0(a) 0(a) 101(b) 1101(e), no ambiguity.
Algorithm
HUFFMAN(C) (page 388)
  # C is the array of nodes that carry (letter, frequency)
  n ← |C|
  Q ← C                      # put all elements into priority queue
  for i ← 1 to n − 1
    do z ← MAKE-NEW-NODE
       left[z] ← x ← EXTRACT-MIN(Q)
       right[z] ← y ← EXTRACT-MIN(Q)
       f[z] ← f[x] + f[y]
       INSERT(Q, z)
  return EXTRACT-MIN(Q)
Python version
http://e3.uci.edu/02f/15845/huffman.html
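Independently of the course's Python version linked above, the HUFFMAN procedure can be sketched with the standard heapq module. This is a minimal sketch under assumptions of its own: trees are represented as nested (left, right) tuples with single-character strings as leaves, and a running counter breaks frequency ties so that the heap never has to compare two trees.

```python
import heapq
from itertools import count

def huffman(freq):
    """freq: dict mapping character -> frequency. Returns the code tree."""
    tie = count()                                  # tie-breaker for equal frequencies
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):
        fx, _tx, x = heapq.heappop(heap)           # two least-frequent nodes
        fy, _ty, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))
    return heap[0][2]

def codewords(tree, prefix=""):
    """Assign 0 to left branches and 1 to right branches."""
    if isinstance(tree, str):                      # leaf: a single character
        return {tree: prefix or "0"}
    left, right = tree
    codes = codewords(left, prefix + "0")
    codes.update(codewords(right, prefix + "1"))
    return codes

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = codewords(huffman(freq))
cost = sum(freq[ch] * len(code) for ch, code in codes.items())
print(cost)  # 224
```

The exact bit patterns depend on which child is placed on the left, so they may differ from the table above, but the codeword lengths and the total cost are the same.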
[Figure: Huffman tree construction. Initially the min-priority queue Q gets the character/frequency nodes 'f': 5, 'e': 9, 'c': 12, 'b': 13, 'd': 16, 'a': 45. Each iteration: 1. extract the two minimum nodes; 2. make a binary tree z with the sum of their frequencies; 3. insert z into Q. i=1: 'f' and 'e' merge into 14. i=2: 'c' and 'b' merge into 25. i=3: 14 and 'd' merge into 30. Continuing, 25 and 30 merge into 55, and 'a' and 55 merge into the root 100, with branches labeled 0 and 1.]
Time complexity:
  $n - 1$ iterations, each heap operation is $O(\lg n)$, so this is $O(n \lg n)$ for $n$ characters.
Correctness
Greedy property:
  $C$ = alphabet; $x, y \in C$ are the two characters with the lowest frequencies (most rarely used).
  Then there exists an optimal prefix code such that the codewords for $x$ and $y$ have the same length and differ only in the last bit.
  If $a, b$ are sibling leaves at the deepest depth and have higher frequencies, then we can swap $a, b$ with $x, y$ and obtain a new tree whose cost (sum of freq × depth) is no higher.
Optimal substructure:
  If $T'$ is an optimal tree for alphabet $C'$, then $T$ is an optimal tree for alphabet $C$, where
    $C$ and $C'$ are the same except that $x, y \in C$ while $z \in C'$,
    $f$ is the same except $f[z] = f[x] + f[y]$, and
    we construct $T$ from $T'$ by replacing the leaf node for $z$ with an internal node that is the parent of $x, y$.
Lecture 9
Basic Graph Algorithm
This lecture
Graph data structure
Graph traversal ordering definition
9.1 Graphs G = (V, E)
$V$: vertices. Notation: $u, v \in V$.
$E$: edges, each connects a pair of vertices.
  $E \subseteq V \times V$; $(u, v) \in E$.
Variations
Undirected or directed graph (digraph).
Weighted: by default, this means weighted edges, $w : E \to \mathbb{R}$.
  Notation: the weight of edge $(u, v)$ is written as $w(u, v)$.
  Also possible to have weighted vertices.
Undirected Graphs
Degree of a vertex: # of edges connected to the vertex.
Complete graph, aka clique: undirected graph where all pairs of vertices are connected.
Bipartite graph: undirected graph whose $V = V_1 \cup V_2$ and $E \subseteq V_1 \times V_2$.
Multigraph: can have multiple edges between the same pair of vertices (including self edges).
Hypergraph: an edge can connect more than two vertices.
Digraphs
In-degree: # of incoming edges; out-degree: # of outgoing edges.
Path: $\langle v_0, v_1, \ldots, v_k \rangle$, where $(v_i, v_{i+1}) \in E$.
Simple path: a path where all $v_i$ are distinct.
Cycle: a nontrivial simple path plus $(v_k, v_0)$ to close the path.
DAG: directed acyclic graph (no cycle).
Strongly connected: digraph whose vertices are all reachable from each other.
Representation
Adjacency list
  Associate each vertex with a linked list of its neighbors on its outgoing edges.
  Good for sparse graphs.
  For a weighted graph, need to store the weight in the linked list.
Adjacency matrix
  A bit matrix to indicate presence/absence of each edge.
  Weighted: can store the weight in the matrix.
[Figure: an example digraph on vertices u, v, w, x, y, shown with its adjacency list and its adjacency matrix.]
Tradeoffs:
  Adj. list: easier to grow # of vertices, good for sparse graphs.
  Adj. matrix: faster access, but $O(V^2)$ storage.
9.2 Breadth-first search
Distance here means # of edges, not edge weight!
E.g., start from $u$:
  $\delta = 0$: u
  $\delta = 1$: v, y
  $\delta = 2$: w, x
[Figure: the example graph with start vertex u and the $\delta = 1$ and $\delta = 2$ frontiers marked.]
Visit vertices in increasing shortest distance:
  visit all vertices of distance $d$ before visiting any of distance $d+1$.
BFS order may not be unique!!
Example: BFS(G, u) can be
  [u, v, y, w, x], [u, v, y, x, w], [u, y, v, w, x], [u, y, v, x, w]
Algorithm
BFS(G, s)
  for each u ∈ V[G] − {s}
    do color[u] ← WHITE
       d[u] ← ∞
       π[u] ← NIL
  color[s] ← GRAY
  d[s] ← 0
  π[s] ← NIL
  Q ← ∅
  ENQUEUE(Q, s)
  while Q ≠ ∅
    do u ← DEQUEUE(Q)
       for each v ∈ Adj[u]
         do if color[v] = WHITE
           then color[v] ← GRAY
                d[v] ← d[u] + 1
                π[v] ← u
                ENQUEUE(Q, v)
       color[u] ← BLACK
Main idea: for each vertex at distance $d$, enqueue its unvisited neighbors, which are at distance $d+1$.
Avoid queuing a vertex more than once: use color.
  WHITE: never been enqueued
  GRAY: currently in queue
  BLACK: already visited
  Actually the color is somewhat redundant; it is enough to test $\pi$ (predecessor).
Lemma 22.1: for any edge $(u, v) \in E$, $\delta(s, v) \le \delta(s, u) + 1$.
  If we can reach $u$ in $\delta$ edges, we can reach $v$ in $\delta + 1$ edges (by taking edge $(u, v)$).
  $v$ might actually be closer than $u$, but that's OK.
$d[v]$ computed by BFS satisfies $d[v] \ge \delta(s, v)$, and BFS ends with $d[v] = \delta(s, v)$:
  Initially $d[u] = \infty \ge \delta(s, u)$.
  Base case: since we start with $d[s] = 0$, all those $u$ with $\delta(s, u) = 1$ will get $d[u] = \delta(s, u) = 1$.
  Induction: assume $d[u] = \delta(s, u)$ and consider edge $(u, v)$.
    If color[v] ≠ WHITE, then it has been visited before and $d[v] \le \delta(s, u) + 1$.
    If color[v] = WHITE, then it is being visited for the first time: $d[v] = d[u] + 1 = \delta(s, u) + 1 = \delta(s, v)$.
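A compact Python sketch of BFS is shown below. Following the remark that the color is redundant, it tests membership in the d dictionary instead of keeping colors. The adjacency dictionary is an assumed reconstruction of the example digraph (the figure is lost), chosen so that the distances match the $\delta$ values listed earlier.

```python
from collections import deque

def bfs(adj, s):
    """Return (d, pi, order): edge-count distances, predecessors, and visit order from s."""
    d = {s: 0}
    pi = {s: None}
    order = []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in d:            # "white": never enqueued before
                d[v] = d[u] + 1
                pi[v] = u
                queue.append(v)
    return d, pi, order

# Assumed edges for the example digraph on u, v, w, x, y.
adj = {"u": ["v", "y"], "v": ["x", "w"], "w": ["x"], "x": ["y"], "y": ["v"]}
d, pi, order = bfs(adj, "u")
print(order)  # ['u', 'v', 'y', 'x', 'w']
print(d)      # {'u': 0, 'v': 1, 'y': 1, 'x': 2, 'w': 2}
```

Listing the neighbors in a different order gives one of the other valid BFS orders; the distances do not change.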
9.3 Depth-first search
Depth is not necessarily the same as distance in BFS!
Discover vertices (use a stack) before we visit them!
  Push vertices on the stack (if we haven't pushed them before).
DFS order may not be unique!
Example: DFS with a tree
[Figure: tree with root u (start); u has children v and w, v has children x and y, and y has child z.]
Parenthesization: (u (v (x x) (y (z z) y) v) (w w) u)
Discovery/finish times: u: 1/12, v: 2/9, x: 3/4, y: 5/8, z: 6/7, w: 10/11
Output when closing parenthesis: [x, z, y, v, w, u]
DFS(G)
1 for each u ∈ V[G]
2   do color[u] ← WHITE
3      π[u] ← NIL
4 time ← 0
5 for each u ∈ V[G]
6   do if color[u] = WHITE
7     then DFS-VISIT(u)
DFS-VISIT(u)
1 color[u] ← GRAY              # discovered u
2 time ← time + 1
3 d[u] ← time
4 for each v ∈ Adj[u]
5   do if color[v] = WHITE
6     then π[v] ← u
7          DFS-VISIT(v)
8 color[u] ← BLACK
9 f[u] ← time ← time + 1
Example: DFS with a DAG
[Figure: a DAG on vertices u, v, w, x, y, z, traversed in two different DFS orders, each with times 1 to 12.]
One possible DFS order:
  (u (v (y (x x) y) v) u) (w (z z) w)
  Output: [x, y, v, u, z, w]
Another possible DFS order:
  (w (z z) (y (x (v v) x) y) w) (u u)
  Output: [z, v, x, y, w, u]
Generalization of post-order traversal:
  doesn't need a root: all vertices are considered;
  automatically discovers nodes even when the graph is disconnected.
Time complexity: $\Theta(V + E)$:
  the DFS main routine is $\Theta(V)$;
  DFS-VISIT is called once per vertex and loops $|Adj[v]|$ times, for a total of $\sum_v |Adj[v]| = \Theta(E)$.
Color encoding
  WHITE: never been visited
  GRAY: been pushed on the stack, not yet popped
  BLACK: been visited (popped from stack)
Classification of edges
$G_\pi$ = depth-first forest = set of subgraphs of $G$ induced by DFS when assigning $\pi$.
Tree edges: edges $(u, v) \in G_\pi$, discovered when iterating over Adj[u] (line 4 of DFS-VISIT):
  ($u$, $v$ = WHITE) (i.e., $v$ has never been visited).
Back edges: edges from a descendant to an ancestor:
  ($u$, $v$ = GRAY) (i.e., $v$ is still on the stack).
Forward edges: ancestor to descendant, but not a tree edge:
  ($u$, $v$ = BLACK) (i.e., $v$ has been visited and is a descendant of $u$).
Cross edges: either from a later tree to an earlier tree, or from a later branch to an earlier branch:
  ($u$, $v$ = BLACK) (i.e., $v$ has been visited but is not a descendant of $u$).
[Figure: DFS of a digraph on vertices u, v, w, x, y, z with times 1 to 12, with each edge labeled as a tree edge, back edge, forward edge, or cross edge.]
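The timestamps and the edge classification can be sketched together in Python. The color tests follow the rules listed above; telling a forward edge from a cross edge by comparing discovery times (d[u] < d[v] means v is a descendant of u) is the usual way to make the "is a descendant" test concrete and is an addition of this sketch. The graph is the tree example from earlier plus three extra edges, (z, v), (u, y), and (w, x), which are hypothetical and added only so that every edge type appears.

```python
def dfs(adj):
    """Return (d, f, pi, kind): discovery/finish times, predecessors, and edge types."""
    color = {u: "WHITE" for u in adj}
    d, f, kind = {}, {}, {}
    pi = {u: None for u in adj}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"                 # discovered u
        time += 1
        d[u] = time
        for v in adj[u]:
            if color[v] == "WHITE":
                kind[(u, v)] = "tree"
                pi[v] = u
                visit(v)
            elif color[v] == "GRAY":      # v is still on the stack: an ancestor
                kind[(u, v)] = "back"
            elif d[u] < d[v]:             # v finished and was discovered after u
                kind[(u, v)] = "forward"
            else:                         # v finished and was discovered before u
                kind[(u, v)] = "cross"
        color[u] = "BLACK"
        time += 1
        f[u] = time

    for u in adj:
        if color[u] == "WHITE":
            visit(u)
    return d, f, pi, kind

# Tree example from the notes, plus hypothetical non-tree edges.
adj = {"u": ["v", "y", "w"], "v": ["x", "y"], "w": ["x"],
       "x": [], "y": ["z"], "z": ["v"]}
d, f, pi, kind = dfs(adj)
print(sorted(f, key=f.get))  # ['x', 'z', 'y', 'v', 'w', 'u']
```

The recursion plays the role of the explicit stack; for very deep graphs an iterative version would avoid Python's recursion limit.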
9.4 Topological sort
DAGs only! (directed acyclic graphs)
Produces a serialization scheme for partially ordered vertices.
  A vertex is eligible when all of its predecessors have been visited.
  Multiple topological sorts are possible.
[Figure: a DAG on vertices t, u, v, w, x, y, shown at successive steps with the eligible vertices marked.]
Topological sort: [t, u, v, x, w, y]
Another topological sort: [t, v, x, w, y, v]
Algorithm: can use a queue to track eligible vertices.
Book: uses DFS in reverse.
  Run DFS, but output the vertices in reverse order of finishing time.
  Does not find all topological orders; only the DFS-based ones.
Justification:
  In DFS, $v$ is not finished until all of its descendants have been finished.
  In reverse: $v$ is output before all of its descendants.
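The queue-based alternative mentioned above can be sketched as follows: a vertex becomes eligible once its count of unvisited predecessors drops to zero. The edge set below is hypothetical (the figure's edges are not recoverable from the text); it is merely one DAG on t, u, v, w, x, y for which [t, u, v, x, w, y] is a valid order.

```python
from collections import deque

def topological_sort(adj):
    """Queue-based topological sort; raises ValueError if the graph has a cycle."""
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1
    eligible = deque(u for u in adj if indegree[u] == 0)
    order = []
    while eligible:
        u = eligible.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1          # one more predecessor of v has been visited
            if indegree[v] == 0:
                eligible.append(v)
    if len(order) != len(adj):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

# Hypothetical DAG, for illustration only.
dag = {"t": ["u", "v"], "u": ["x"], "v": ["x", "w"],
       "w": ["y"], "x": ["y"], "y": []}
print(topological_sort(dag))  # ['t', 'u', 'v', 'x', 'w', 'y']
```

Any choice among the currently eligible vertices is valid, which is why a DAG usually has several topological orders.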
9.5 Strongly Connected Components (SCC)
SCC: a maximal set of vertices $C \subseteq V$ of a digraph $G(V, E)$ such that all vertices in $C$ are reachable from each other.
How to find SCCs?
  Run DFS(G).
  Call DFS($G^T$) (the transpose of $G$, with every edge reversed), but consider the vertices in order of decreasing $f[u]$.
  Output one SCC for each call on line 7 of DFS($G^T$).
[Figure: DFS(G) and DFS($G^T$) on a digraph with vertices a to h, with times up to 16, and the resulting component graph $G^{SCC}$.]
SCCs found: {a, b, e}, {c, d}, {f, g}, {h}.
Why does this work?
Claim: $G$ and $G^T$ have the same SCCs.
(Lemma 22.13) The component graph $G^{SCC}$ is a DAG.
Notation:
  $C$ denotes a strongly connected component.
  $d[C] = \min_{v \in C} d[v]$ (earliest of the discovery times)
  $f[C] = \max_{v \in C} f[v]$ (latest of the finish times)
  $C, C'$: two distinct SCCs of $G$.
(Lemma 22.14) If there is an edge $(u, v) \in E$ with $u \in C$ and $v \in C'$, then $f(C) > f(C')$.
  Intuition: $C'$ is deeper than $C$ and therefore finishes earlier.
(Lemma 22.15, paraphrased from book!) If $(v, u) \in E^T$ with $v \in C'$ and $u \in C$, then $f(C') < f(C)$.
  Proof: $(v, u) \in E^T \Rightarrow (u, v) \in E$; because $v \in C'$ is deeper (finishes earlier), $f(C') < f(C)$.
[Figure: edge $u \to v$ from $C$ to $C'$ in $G$, with $f(C) > f(C')$; the reversed edge in $G^T$ means $G$ contains $u \to v$.]
Lemma 22.15 says cross-SCC edges in $G^T$ must go from $C'$ to $C$, which was already visited earlier!
  (The 2nd-pass DFS starts from the larger $f(C)$ first, then the smaller $f(C')$.)
Each depth-first tree produced by DFS($G^T$) is exactly one SCC.
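The two-pass procedure can be sketched in Python as below. The edge list is an assumption: the figure's edges did not survive, so this digraph on a to h was chosen to have the same four components as listed above.

```python
def strongly_connected_components(adj):
    """Two-pass DFS: first on G to get finish order, then on G^T in decreasing finish time."""
    finished, seen = [], set()

    def visit(u):                       # first pass: record finish order
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                visit(v)
        finished.append(u)

    for u in adj:
        if u not in seen:
            visit(u)

    transpose = {u: [] for u in adj}    # G^T: every edge reversed
    for u in adj:
        for v in adj[u]:
            transpose[v].append(u)

    components, assigned = [], set()

    def collect(u, comp):               # second pass: one DFS tree = one SCC
        assigned.add(u)
        comp.add(u)
        for v in transpose[u]:
            if v not in assigned:
                collect(v, comp)

    for u in reversed(finished):        # decreasing finish time
        if u not in assigned:
            comp = set()
            collect(u, comp)
            components.append(comp)
    return components

# Assumed edges, chosen to give the components listed in the notes.
g = {"a": ["b"], "b": ["c", "e", "f"], "c": ["d", "g"], "d": ["c", "h"],
     "e": ["a", "f"], "f": ["g"], "g": ["f", "h"], "h": ["h"]}
print(strongly_connected_components(g))
```

The components come out in topological order of the component graph, because the second pass starts from the component with the largest finish time.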
9.6 Python Programming: How to represent a
graph!?
Don't do it the C++/Java way!
  Define classes for vertices, edges, graphs.
  Define methods for adding/removing vertices and edges.
  Define access methods for getting/setting keys.
  Get garbage when trying to look at a graph, vertex, edge...
Instead: adjacency list, using a Python dictionary.
9.6.1 Simple: Directed graph, no weights
    Adj = {
        's': ['t', 'y'],
        't': ['x', 'y', 'z'],
        'x': ['z'],
        'y': ['t', 'x', 'z'],
        'z': ['s', 'x'],
    }
[Figure: the digraph on vertices s, t, x, y, z.]
Adjacency list: Adj['t']
    >>> Adj['t']
    ['x', 'y', 'z']
Vertices V of G
    >>> list(Adj.keys())
    ['s', 't', 'x', 'y', 'z']
    # Note: Python 3.7+ keeps insertion order; older versions could return another order!
However! There are several limitations with this approach:
  does not handle edge weights;
  not efficient to test whether $(u, v) \in E$ (must search the list);
  exposes implementation details.
9.6.2 Weighted graph, use dictionary of dictionaries
    L = {
        's': {'t': 10, 'y': 5},
        't': {'x': 1, 'y': 2, 'z': 8},
        'x': {'z': 4},
        'y': {'t': 3, 'x': 9, 'z': 2},
        'z': {'s': 7, 'x': 6},
    }
[Figure: the same digraph with edge weights.]
Vertices V of G: same as before
    >>> list(L.keys())
    ['s', 't', 'x', 'y', 'z']
Adjacency list Adj[t]:
    >>> list(L['t'].keys())
    ['x', 'y', 'z']
    # Note: insertion order in Python 3.7+; could be in another order in older versions!
Weight of edge: w(s, t)
    >>> L['s']['t']
    10
9.6.3 Wrap data structure in a class
    class Graph:
        def __init__(self, L):
            self.G = L
        def w(self, u, v):
            return self.G[u][v]
        def Adj(self, u):
            return list(self.G[u].keys())
        def V(self):
            return list(self.G.keys())
To create a new graph:
    >>> g = Graph(L)
    >>> g.w('s', 't')    # w(s, t)
    10
    >>> g.Adj('y')       # Adj[y]
    ['t', 'x', 'z']
    >>> g.V()            # V[G]
    ['s', 't', 'x', 'y', 'z']
Note:
  g.Adj('y') is the adj. list of the vertex named 'y'.
  g.Adj(y) is the adj. list of the vertex referenced by variable y;
  y itself could contain the name 's', 't', 'x', 'y', 'z', ...
What if you need additional attributes? (e.g., BFS needs color)
  Just use it! You need to initialize it to { } first:
    >>> g.color = { }
  Now BFS can use g.color[v].
Lecture 10
Minimum Spanning Trees
This lecture
Quiz today
Min-spanning trees (Ch. 23): Kruskal's and Prim's algorithms
  some coverage of disjoint sets (Ch. 21), used by Kruskal's
10.1 Min-spanning trees
Spanning tree of a connected, undirected, edge-weighted graph $G$
  = subgraph with a subset of the edges that connects all vertices
  # of edges = $|V| - 1$
Minimum spanning tree:
  total weight of the edges is minimized;
  edge weights need not be distinct; the MST need not be unique.
Example:
[Figure: graph $G(V, E)$ on vertices a to i with weighted edges, shown with a minimum spanning tree and another minimum spanning tree.]
The MST problem has a greedy solution.
Two MST algorithms:
Kruskal's algorithm:
  start with $n$ trivial trees (a forest), and keep adding the min-cost edge that merges two trees.
Prim's algorithm:
  grow a single tree by adding the min-cost edge connected to the current tree.
10.2 Kruskal's algorithm:
Make singleton sets (trivial trees) for each vertex.
For each edge $e = (u, v)$ in increasing (or nondecreasing) order of weight $w$:
  if $u, v$ belong to different trees $T_1, T_2$, then
    join trees $T_1, T_2$ by connecting edge $(u, v)$;
    add edge $(u, v)$ to the edge set $A$ of the min-spanning tree.
  Actually, we can stop after having $|V| - 1$ edges.
Return $A$.
[Figure: Kruskal's algorithm on the input graph $G(V, E)$, step by step.]
Sort the edges by weight:
  1: (g, h)
  2: (f, g), (c, i)
  4: (a, b), (c, f)
  6: (g, i)
  7: (c, d), (h, i)
  8: (a, h), (b, c)
  9: (d, e)
  10: (e, f)
  11: (b, h)
  14: (d, f)
Initially: $n$ trivial trees; no edge added.
In increasing order of edge weight, add each edge if it joins two different trees.
  Edge (i, g) does not join two trees, therefore it can't be added.
  Edge (i, h) does not join two trees after (c, d) has been added.
  Edge (a, h) does not join two trees after (b, c) has been added.
The resulting MST is not unique: (a, h) and (b, c) both have weight 8, and either one can be used.
Algorithm
MST-KRUSKAL(G, w)
1 A ← ∅
2 for each v ∈ V[G]
3   do MAKE-SET(v)
4 sort E by weight w
5 for (u, v) ∈ E in nondecreasing order by weight
6   do if FIND-SET(u) ≠ FIND-SET(v)
7     then A ← A ∪ {(u, v)}
8          UNION(u, v)
9 return A
Disjoint-set data structure (Chapter 21)
Collection $\mathcal{S} = \{S_1, S_2, \ldots, S_k\}$: a set of disjoint dynamic sets.
  Disjoint: no two sets $S_i, S_j$ contain the same member: $S_i \cap S_j = \emptyset$ whenever $i \ne j$.
  Given $x$, want to be able to find the (unique) set $S_i$ that contains it.
MAKE-SET(x):
  make a new set $S = \{x\}$ and add it to collection $\mathcal{S}$.
  Assumption: $x$ is not a member of another set!!
UNION(x, y):
  assume $x \in S_x$, $y \in S_y$, and $S_x \cap S_y = \emptyset$;
  merges $S_x$ and $S_y$ into the same set:
  remove $S_x, S_y$ from collection $\mathcal{S}$ and add the new union set to $\mathcal{S}$.
FIND-SET(x):
  returns (a reference to) the set $S$ that contains $x$.
Application to Kruskal's algorithm:
  Need to test whether edge $(u, v)$ connects two different trees (i.e., whether $u, v$ are in different connected components):
    line 6: if FIND-SET(u) ≠ FIND-SET(v)
  After adding edge $(u, v)$, need to merge the two different connected components into one:
    line 8: UNION(u, v)
Example: the collection after each of the first five unions:
  {a} {b} {c} {d} {e} {f} {g} {h} {i}
  {a} {b} {c} {d} {e} {f} {g, h} {i}
  {a} {b} {c} {d} {e} {f, g, h} {i}
  {a} {b} {c, i} {d} {e} {f, g, h}
  {a, b} {c, i} {d} {e} {f, g, h}
  {a, b} {c, i, f, g, h} {d} {e}
How to implement disjoint sets?
  Linked list
  Trees (forest) with path compression
Linked-list implementation
Linked-list node:
  key
  next
  representative: head of the linked list
  link to tail: for fast linking of two lists
To implement the API:
  MAKE-SET: make a linked-list node and add a pointer to itself as representative. $\Theta(1)$ op.
  FIND-SET: return the pointer to the representative.
  UNION(x, y):
    Get the tail of $x$'s list $X$ and the head of $y$'s list $Y$.
    Append $Y$ to the end of $X$ (the easier part).
    Update all of $Y$'s representative links to point to the head of $X$ (the harder part!!).
[Figure: union of the lists in S = {{a, b}, {c, f, g, h}}. Step 1: link the tail of the a, b list to c. Step 2: adjust the representative links of c, f, g, h; there are many representative links to update.]
$\Theta(n^2)$ total time in the worst case for a sequence of $n$ operations, i.e., $\Theta(n)$ per operation on average.
Improvement: forest with path compression
  Tree instead of linked list (height grows as $\lg n$ when the shallower tree is attached under the deeper one).
  Lazy update for the representative:
    during FIND-SET, also adjust the representative link as a side effect.
[Figure: UNION adjusts c's pointer only and doesn't bother with its children's links; FIND-SET(h) follows the path to the root and updates the links as a side effect.]
10.2.1 Runtime analysis
Depends on the data structure used!!
$O(E \lg E + C)$, where
  $C = E \log V$ without path compression,
  $C = E\,\alpha(E, V)$ with path compression, where
  $\alpha$ = functional inverse of Ackermann's function: very slow growing, almost constant.
Sorting: $O(E \lg E)$ for comparison sorting.
MAKE-SET: $O(V)$ to initialize.
UNIONs: $V - 1$ times.
FIND-SET: $2E$ times, because we do two finds for each edge.
Actually, $E \lg E = \Theta(E \lg V)$:
  $V - 1 \le E \le V^2$, so $\lg(V - 1) \le \lg E \le 2 \lg V$.
Correctness of Kruskal's: greedy
Loop invariant: $A$ is a subset of some min-spanning tree, where $A$ is the set of edges added so far.
Let $T$ be an MST of $G$ that contains edge $(u, v)$ of weight $w(u, v)$, and
let $T_1$ and $T_2$ be the two subtrees of $T$ joined by $(u, v)$, spanning subgraphs $G_1$ and $G_2$.
  $w(T) = w(T_1) + w(T_2) + w(u, v)$, with $(u, v)$ as the only edge of $T$ connecting the two.
  $T_1$ and $T_2$ must be MSTs of $G_1$ and $G_2$: there can't be better trees than $T_1$ or $T_2$, or else $T$ would be suboptimal.
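Kruskal's algorithm with a disjoint-set forest can be sketched in Python as follows. The forest uses path compression plus union by rank; union by rank is an addition of this sketch (the notes mention only path compression). The edge list is the sorted list from the example above, written as (weight, u, v) tuples.

```python
def kruskal(vertices, edges):
    """edges: iterable of (weight, u, v). Returns the MST as a list of (u, v, weight)."""
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):
        root = x
        while parent[root] != root:     # follow the path to the root
            root = parent[root]
        while parent[x] != root:        # path compression as a side effect
            parent[x], x = root, parent[x]
        return root

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx                 # attach the shallower tree under the deeper one
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    mst = []
    for w, u, v in sorted(edges):       # nondecreasing order by weight
        if find_set(u) != find_set(v):  # edge joins two different trees
            mst.append((u, v, w))
            union(u, v)
    return mst

edges = [(1, "g", "h"), (2, "f", "g"), (2, "c", "i"), (4, "a", "b"), (4, "c", "f"),
         (6, "g", "i"), (7, "c", "d"), (7, "h", "i"), (8, "a", "h"), (8, "b", "c"),
         (9, "d", "e"), (10, "e", "f"), (11, "b", "h"), (14, "d", "f")]
mst = kruskal("abcdefghi", edges)
print(len(mst), sum(w for _, _, w in mst))  # 8 37
```

With this tie-breaking, (a, h) is taken and (b, c) is skipped; sorting the two weight-8 edges the other way yields the other tree of the same total weight.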
10.3 Prim's algorithm:
Start with a single node.
Grow the tree one edge at a time, always adding the edge with the least weight
  (the edge must be connected to the current tree we're growing).
PRIM(G, w, r)
  for u ∈ V[G]
    do key[u] ← ∞
       π[u] ← NIL
  key[r] ← 0
  Q ← V[G]
  while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       for v ∈ Adj[u]
         do if v ∈ Q and w(u, v) < key[v]
           then π[v] ← u
                key[v] ← w(u, v)      # implicit DECREASE-KEY
Complexity
  Build heap: $\Theta(V)$.
  EXTRACT-MIN: $|V|$ times at $O(\lg V)$ each, bringing the total up to $O(V \lg V)$.
  DECREASE-KEY: a total of $O(E)$ times;
    when a key is decreased, we need to adjust its position in the heap, in $O(\lg V)$ time.
  $O(V \lg V + E \lg V) = O(E \lg V)$.
Correctness of Prim's
Similar argument: the loop invariant is that $A$ is a subset of some MST.
Defn: a safe edge for $A$ is an edge $(u, v)$ such that $A \cup \{(u, v)\}$ is also a subset of some MST.
Kruskal's finds a safe edge by connecting two MSTs $T_1, T_2$ with a min-weight edge between them.
Prim's finds a safe edge by connecting the main tree to a trivial (single-vertex) MST with a min-weight edge.
If $T_1, T_2$ are two distinct MSTs, $u \in T_1$, $v \in T_2$, and $w(u, v)$ is the minimum over such edges, then $T' = T_1 \cup \{(u, v)\} \cup T_2$ is also an MST.
Closing remarks:
  Kruskal's: $\log^2 E$ for massively parallel implementations.
  Prim's: harder to parallelise, good for a smaller # of processors.
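Prim's algorithm can be sketched with Python's heapq as shown below. Since heapq has no DECREASE-KEY, this sketch swaps in the common "lazy deletion" technique: a new entry is pushed whenever a cheaper edge to a vertex is seen, and stale entries are skipped when popped. That is a substitution for the DECREASE-KEY step in the pseudocode, with the same $O(E \lg V)$ bound. The graph is the example graph from the Kruskal section, stored as a dictionary of dictionaries.

```python
import heapq

def prim(adj, r):
    """adj: dict of dicts with undirected edge weights. Returns (total_weight, pi)."""
    in_tree = set()
    pi = {}
    total = 0
    heap = [(0, r, None)]               # (key, vertex, tree vertex it would attach to)
    while heap:
        key, u, parent = heapq.heappop(heap)
        if u in in_tree:                # stale entry: u was already added with a smaller key
            continue
        in_tree.add(u)
        pi[u] = parent
        total += key
        for v, w in adj[u].items():
            if v not in in_tree:
                heapq.heappush(heap, (w, v, u))
    return total, pi

edge_list = [("g", "h", 1), ("f", "g", 2), ("c", "i", 2), ("a", "b", 4), ("c", "f", 4),
             ("g", "i", 6), ("c", "d", 7), ("h", "i", 7), ("a", "h", 8), ("b", "c", 8),
             ("d", "e", 9), ("e", "f", 10), ("b", "h", 11), ("d", "f", 14)]
graph = {v: {} for v in "abcdefghi"}
for u, v, w in edge_list:               # undirected: store each edge in both directions
    graph[u][v] = w
    graph[v][u] = w

total, pi = prim(graph, "a")
print(total)  # 37
```

The tree found may differ from the one Kruskal's picks when weights tie, but the total weight is the same.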
Lecture 11
Single Src Shortest Paths
This lecture
DIJKSTRA's: $O(E \lg V)$, greedy, no negative-weight edges allowed
BELLMAN-FORD: $O(VE)$, dynamic programming, handles negative edge weights (and detects negative cycles)
Constraint solving with Bellman-Ford
Single-Source Shortest Paths
One starting vertex; find the shortest paths to all other vertices.
  Same as the single-destination shortest-paths problem (just reverse the edge directions).
  Single-pair $u \to v$ shortest path: same complexity as single-source $u \to$ all vertices.
  All-pairs shortest paths (next time).
More general than BFS: variable weights.
11.1 DAG shortest paths
DAG: directed acyclic graph; no cycle allowed.
Initialize all $d[v] = \infty$, set $d[s] = 0$.
Visit the vertices in topological order
  (a vertex is eligible only if all of its predecessors have been visited).
When visiting $u$, call RELAX(u, v, w) on all $v \in Adj[u]$.
Time complexity: $\Theta(V + E)$.
RELAX
What is the shortest way to go from $s$ to $v$?
$d[v]$ is the shortest distance we know from $s$ to $v$ so far.
If going through $u$ to $v$ is shorter, then go through $u$.
RELAX(u, v, w)
  if d[v] > d[u] + w(u, v)
    then d[v] ← d[u] + w(u, v)
         π[v] ← u
[Figure: DAG on vertices r, s, t, x, y, z laid out in topological order, with source s; the d values are updated as each vertex is visited:
  initially: d[s] = 0, all others ∞
  after visiting s: d[t] = 2, d[x] = 6
  after visiting t: d[y] = 6, d[z] = 4
  after visiting x: d[y] = 5
  after visiting y: d[z] = 3]
Result:
  s → r: ∞ (not reachable)
  s → t: 2
  s → x: 6
  s → x → y: 5
  s → x → y → z: 3
11.1.1 Notation
$d[v]$ = distance from the source; $d[s] = 0$.
$\delta(u, v)$ = shortest-path distance from $u$ to $v$.
Goal: $d[v] = \delta(s, v)$ for every node in the graph.
Correctness:
  By the time $v$ is visited, all of its predecessors $u$ have called RELAX(u, v, w).
  This means that when $v$ is visited, $d[v] = \min_u \{ d[u] + w(u, v) \}$.
  The inductive hypothesis is $d[u] = \delta(s, u)$ (shortest).
  So $d[v] = \min_u \{ \delta(s, u) + w(u, v) \} = \delta(s, v)$, because there are no other ways to go from $s$ to $v$.
11.2 Dijkstras
Almost like PRIM's:
identical complexity to Prim's
no negative edge weight allowed
Dijkstra's is directed, vs. Prim's undirected
idea: start with a single node, grow the tree by one edge at a time, always taking the vertex with the least key (d[v] for Dijkstra's)
11.2.1 Algorithm:
initialize all d[v] ← ∞
initialize all π[v] ← NIL
d[s] ← 0        (source node)
S ← ∅           (empty set)
Q ← heap of all vertices, keyed on d[]
main loop:
while Q ≠ ∅
  do u ← EXTRACT-MIN(Q)
     S ← S ∪ {u}
     for v ∈ Adj[u]        (i.e., edge (u, v))
       Prim's: find closer way to reach neighbor
         if free[v] and w(u, v) < key[v] then
           key[v] ← w(u, v)
           pred[v] ← u
       Dijkstra's: RELAX(u, v, w)
         if d[v] > d[u] + w(u, v) then
           d[v] ← d[u] + w(u, v)      (use the shorter one)
           π[v] ← u
need to restore heap property after each weight change!
11.2.2 Example:
[Figure: Dijkstra's example on vertices s, t, x, y, z. Queue contents after each step:
start from root = s: s = 0, all others ∞
dequeue s, relax over Adj[s]: y = 5, t = 10
dequeue y, relax over Adj[y]: z = 7, t = 8, x = 14
dequeue z, relax over Adj[z]: t = 8, x = 13
dequeue t, relax over Adj[t]: x = 9
The set S grows as {s}, {s, y}, {s, y, z}, {s, y, z, t}.]
We're actually done here, though Dijkstra's algorithm would still call Relax over Adj[x]. It shouldn't change anything.
Result:
$s \to s$: 0
$s \to y$: 5
$s \to y \to z$: 7
$s \to y \to t$: 8
$s \to y \to t \to x$: 9
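A minimal Python sketch of Dijkstra's main loop follows. It is an illustration under assumptions: Python's `heapq` has no decrease-key, so instead of restoring the heap property after each change this version pushes a new entry and skips stale ones ("lazy deletion"); the function name and the edge list (read off the example figure, matching the standard textbook graph) are assumed, not taken from the notes.

```python
import heapq
from math import inf

def dijkstra(adj, source):
    """adj maps u to a list of (v, weight) pairs with weight >= 0."""
    d = {u: inf for u in adj}
    pred = {u: None for u in adj}
    d[source] = 0
    heap = [(0, source)]
    done = set()                          # the set S of finished vertices
    while heap:
        du, u = heapq.heappop(heap)       # EXTRACT-MIN
        if u in done:
            continue                      # stale heap entry, already finished
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:             # RELAX(u, v, w)
                d[v] = du + w
                pred[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, pred

# Assumed edges for the example graph on s, t, x, y, z.
graph = {
    "s": [("t", 10), ("y", 5)],
    "t": [("x", 1), ("y", 2)],
    "x": [("z", 4)],
    "y": [("t", 3), ("x", 9), ("z", 2)],
    "z": [("x", 6), ("s", 7)],
}
dist, pred = dijkstra(graph, "s")
```

Lazy deletion keeps the code short at the cost of a heap that can hold up to E entries; the asymptotic bound stays $O(E \lg V)$.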
11.2.3 Correctness proof sketch:
[Theorem 24.6:] whenever u is added to S, $d[u] = \delta(s, u)$.
Proof:
1. initialization: $S = \emptyset$. trivially satisfied.
2. loop invariant: $d[v] = \delta(s, v)$ for each $v \in S$
3. Also, $d[v] \ge \delta(s, v)$ for any v at all times.
4. Let u, y be two vertices NOT YET added to S
[Figure: source s and a vertex x inside S; u and y in V − S.]
5. if we add u before y, and there is a path $y \leadsto u$, then d[y] must have converged (i.e., $= \delta(s, y)$)
$d[u] \ge \delta(s, u)$   reason (3)
$= \delta(s, y) + \delta(y, u)$   reason (5)
$= d[y] + \delta(y, u)$   substitute; convergence
$\ge d[y]$   no negative weight allowed.
But $d[u] \ge d[y]$ means when we did EXTRACT-MIN, y is smaller and should have been chosen first.
contradiction: we assumed u was dequeued before y.
11.2.4 What about negative weights?
[Figure: Dijkstra's on a graph with vertices s, t, y, z that contains a negative-weight edge. Start from root = s: s = 0, all others ∞. Dequeue s, relax over Adj[s]: y = 5, t = 10. Dequeue y, relax over Adj[y]: z = 15. Relaxing over Adj[t] later lowers d[y] to 4, but y is already in S, so z stays at 15: didn't converge!]
Dijkstra's terminated with z = 15 (via $s \to y \to z$),
even though $\delta(s, z) = 14$ (via $s \to t \to y \to z$)
11.3 Bellman-Ford
Allows negative edge weight
Can detect negative cycles!
11.3.1 Algorithm:
BELLMAN-FORD(G, w, s)
  INITIALIZE-SINGLE-SOURCE(G, s)        (that is, d[u] ← ∞ for all u ∈ V; d[s] ← 0)
  loop |V| − 1 times
    for each edge (u, v) ∈ E             (order does not matter!)
      RELAX(u, v, w)
  now check if we converged:
  for each edge (u, v) ∈ E
    do if RELAX(u, v, w) would still change d[v]      (still haven't converged after |V| − 1 passes!)
         raise Negative Cycle
to make it faster: if a pass changes nothing (test a dirty bit), we have converged and can return early
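The pseudocode above might be written in Python roughly as follows. This is a sketch under assumptions: the edge-list input format, the function name, and the use of `ValueError` to signal a negative cycle are illustrative choices, and the sample edges are assumed to match the five-vertex example used later in this lecture (with negative signs restored).

```python
from math import inf

def bellman_ford(vertices, edges, source):
    """edges is a list of (u, v, weight); raises ValueError on a reachable negative cycle."""
    d = {v: inf for v in vertices}
    d[source] = 0
    for _ in range(len(vertices) - 1):        # |V| - 1 passes
        changed = False                       # the "dirty bit"
        for u, v, w in edges:                 # edge order does not matter
            if d[u] + w < d[v]:               # RELAX(u, v, w)
                d[v] = d[u] + w
                changed = True
        if not changed:
            break                             # converged early
    for u, v, w in edges:                     # one more pass: anything still relaxable?
        if d[u] + w < d[v]:
            raise ValueError("negative cycle reachable from source")
    return d

# Assumed example graph on s, t, x, y, z.
edges = [
    ("t", "x", 5), ("t", "y", 8), ("t", "z", -4),
    ("x", "t", -2),
    ("y", "x", -3), ("y", "z", 9),
    ("z", "x", 7), ("z", "s", 2),
    ("s", "t", 6), ("s", "y", 7),
]
dist = bellman_ford("stxyz", edges, "s")
```

No priority queue is needed; the price is that every edge is examined in every pass.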
11.3.2 Correctness:
Need to prove $d[v] = \delta(s, v)$ after $|V| - 1$ passes if there is no negative cycle.
Lemma: $d[v] \ge \delta(s, v)$
Proof by contradiction:
(1) Suppose $d[v] < \delta(s, v)$, and edge (u, v) exists, and
(2) suppose $d[v] = d[u] + w(u, v)$. Then
$d[v] < \delta(s, v)$   assumption (1), to be contradicted
$\le \delta(s, u) + w(u, v)$   by triangle inequality
$\le d[u] + w(u, v)$   assumption that v is the first violation, so $d[u] \ge \delta(s, u)$.
But this contradicts assumption (2), $d[v] = d[u] + w(u, v)$.
Lemma (stated informally)
shortest path $s \to v_1 \to v_2 \to v_3 \to \ldots \to v$
initially d[s] = 0 is correct.
after 1 pass, $d[v_1]$ is correct, so $d[v_1] = \delta(s, v_1)$
after 2 passes, $d[v_2]$ is correct....
11.3.3 Complexity: $O(VE)$
observation: no Priority Queue
Compare: DAG shortest paths
Relax in (any) topologically sorted order; just 1 pass.
$O(V + E)$
idea: use strongly connected component graphs to speed up Bellman-Ford.
11.4 Bellman-Ford: special case of linear programming
conjunctive pairwise max-difference constraints. For example:
$x_1 - x_2 \le 3$
$x_2 - x_3 \le 2$
$x_1 - x_3 \le 2$
Want
feasibility check: negative cycle $\Rightarrow$ not feasible
solution to the variables $x_i$.
11.4.1 Bellman-Ford as a Constraint Solver!
one vertex per variable, plus anchor s
one edge per constraint (after eliminating redundant constraints)
$x_1 - x_2 \le 5$ would be redundant! (the $\le 3$ constraint is stronger)
Run Bellman-Ford; then $X = (\delta(s, x_1), \delta(s, x_2), \ldots)$ is a feasible solution.
[Figure: Bellman-Ford example on vertices s, t, x, y, z with source s. In every pass the edges are relaxed in the order Adj[t], Adj[x], Adj[y], Adj[z], Adj[s].
1st pass: only s has a finite distance, so only the edges out of s do anything: t = 6, y = 7.
2nd pass: x = 11 via t, then x = 4 via y (keep the shorter path); z = 2 via t.
3rd pass: d[t], d[y], d[z], d[s] didn't change since the previous iteration, so nothing happens to their edges; only x improves a neighbor: t = 2.
4th pass: d[t] changed, so the edges out of t are relaxed again (keep the shorter path).
Converged after 4th pass.]
Lecture 12
All Pairs Shortest Paths
This lecture
Single source cont'd: special case of linear programming, constraint solver
All-pairs shortest paths
Floyd-Warshall: $O(V^3)$, and transitive closure (Warshall's)
Johnson's algorithm
12.1 Special case Linear Programming (cont'd)
$a - b \le 2$ as edge (b, a) with weight = 2
$a - c \le 1$ as edge (c, a) with weight = 1
$b - c \le -3$ as edge (c, b) with weight = −3
Finally, add a source vertex s and edges (s, a), (s, b), (s, c) with weight 0.
[Figure: constraint graph on s, a, b, c with the edges and weights listed above.]
Solve this with BELLMAN-FORD, we get
a = −1 (path $s \to c \to b \to a$)
b = −3 (path $s \to c \to b$)
c = 0 (path $s \to c$)
check our answers:
$(-1) - (-3) \le 2$? yes
$(-1) - (0) \le 1$? yes
$(-3) - (0) \le -3$? yes
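The constraint-graph construction can be sketched in Python as below. The function name `solve_difference_constraints` and the `(x, y, c)` triple format for "x − y ≤ c" are assumptions made for this illustration; the three sample constraints are the ones from the example, with the last bound taken to be −3.

```python
from math import inf

def solve_difference_constraints(variables, constraints):
    """constraints: list of (x, y, c) meaning x - y <= c.
    Returns a feasible assignment as a dict, or None if the system is infeasible."""
    source = object()                                   # anchor vertex s
    edges = [(y, x, c) for x, y, c in constraints]      # x - y <= c becomes edge (y, x), weight c
    edges += [(source, v, 0) for v in variables]        # 0-weight edges from s
    d = {v: inf for v in variables}
    d[source] = 0
    for _ in range(len(variables)):                     # |V| - 1 passes; |V| counts the anchor
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:
        if d[u] + w < d[v]:
            return None                                 # negative cycle: not feasible
    return {v: d[v] for v in variables}

solution = solve_difference_constraints(
    ["a", "b", "c"],
    [("a", "b", 2), ("a", "c", 1), ("b", "c", -3)],
)
infeasible = solve_difference_constraints(
    ["a", "b"],
    [("a", "b", -1), ("b", "a", -1)],
)
```

The second call encodes a − b ≤ −1 and b − a ≤ −1, whose sum gives 0 ≤ −2, so the solver reports infeasibility through the negative cycle.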
12.1.1 Correctness sketch:
Key point: triangle inequality: $\delta(s, v) \le \delta(s, u) + w(u, v)$
By construction, $d[v]$ = min over all such $\delta(s, u) + w(u, v)$
"Min over all" means these properties hold conjunctively.
If there is a negative cycle, then
$\sum_i w((i+1) \% n,\ i) < 0$
But algebraically, $\sum_i (x_{(i+1) \% n} - x_i) = 0$.
since 0 can't be < 0, there exists no feasible solution.
BELLMAN-FORD finds the best solution (i.e., minimizes $\max x_i - \min x_i$)
12.2 All Pairs shortest path
First attempt: run DIJKSTRA's or BELLMAN-FORD once with each vertex as s
BELLMAN-FORD is good for sparse graphs; but overall $O(V^2 E)$, about $O(V^4)$ for dense graphs
FLOYD-WARSHALL's assumes adjacency matrix
JOHNSON's for sparse, adjacency list
12.2.1 Formulation
                  Single-source               All-pairs
curr. best        d[u]                        d[i, j]
optimal           δ(s, u)                     δ(i, j)
on termination    d[u] = δ(s, u)              d[i, j] = δ(i, j)
relax             RELAX(u, v)                 RELAXPAIR(i, k, j)
                  try s ⇝ u → v               try i ⇝ k → j
data struct       adj. list?                  adj. matrix
Adjacency matrix representation
matrix $W = (w_{ij})$
m-th iteration: $W^{(m)} = (w^{(m)}_{ij})$
Recursive solution
matrix $L = (l_{ij})$
$l^{(m)}_{ij}$ = minimum path weight $i \leadsto j$ containing at most m edges
$l^{(0)}_{ij} = \begin{cases} 0 & \text{if } i = j, \\ \infty & \text{if } i \ne j \end{cases}$
for the m-th iteration, want to RELAXPAIR(i, k, j):
if $i \leadsto k$ (m−1 steps) + $k \to j$ (in 1 step) is shorter than the current best $i \leadsto j$ (m−1 steps), then take path $i \leadsto k \to j$.
$l^{(m)}_{ij} = \min\Big( l^{(m-1)}_{ij},\ \min_{1 \le k \le n} \{ l^{(m-1)}_{ik} + w_{kj} \} \Big)$   (curr. best in m−1 steps; or m−1 steps to k, then edge (k, j))
$= \min_{1 \le k \le n} \{ l^{(m-1)}_{ik} + w_{kj} \}$   (since $w_{jj} = 0$)
We know we can reach any node from any other node in at most V − 1 edges. This means
Loop V − 1 times                          Θ(V)
  for k ← 1 to |V|                         Θ(V)
    for all i, j                           V² times
      if l_ij > l_ik + w_kj then           (shorter to go thru k)
        l_ij ← l_ik + w_kj
Total is $O(V^4)$ time, essentially the same as BELLMAN-FORD.
12.2.2 Repackage this as a matrix structure:
$L^{(m)} = L^{(m-1)} \cdot W$ looks a lot like a matrix multiply $C = A \cdot B$:
EXTEND PATH                              MATRIX MULTIPLY
L^(m) = L^(m−1) · W                      C = A · B
l_ik + w_kj                              a_ik · b_kj
l_ij ← min(l_ij, l_ik + w_kj)            c_ij ← c_ij + a_ik · b_kj
EXTENDPATH(L, W)
  for i ← 1 to n
    for j ← 1 to n
      l'_ij ← ∞
      for k ← 1 to n
        l'_ij ← min(l'_ij, l_ik + w_kj)
  return L'
MATRIXMULTIPLY(A, B)
  for i ← 1 to n
    for j ← 1 to n
      c_ij ← 0
      for k ← 1 to n
        c_ij ← c_ij + a_ik · b_kj
  return C
Why do it this way? It can be faster:
the EXTENDPATH "multiply" is associative, so we can do repeated squaring:
$W^{2k} = W^k \cdot W^k$.
so we just need to do this $\lceil \lg(n-1) \rceil$ times to compute $W^{(n-1)}$.
OK to overshoot because it should converge, unless there is a negative cycle.
Complexity: $\Theta(V^3 \lg V)$: each matrix multiply is $\Theta(V^3)$, $\Theta(\lg V)$ squarings.
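A minimal Python sketch of EXTENDPATH and repeated squaring is given below. The names `extend_path` and `all_pairs_by_squaring`, the list-of-lists matrix, and the three-vertex sample matrix are assumptions for illustration; the weight matrix is assumed to have 0 on the diagonal and `inf` for missing edges.

```python
from math import inf

def extend_path(L, W):
    """One (min, +) matrix 'multiply': result[i][j] = min over k of L[i][k] + W[k][j]."""
    n = len(L)
    return [[min(L[i][k] + W[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_by_squaring(W):
    """Square the matrix until paths of at least n - 1 edges are covered."""
    n = len(W)
    L, m = W, 1                 # L holds shortest paths using at most m edges
    while m < n - 1:
        L = extend_path(L, L)   # W^(2m) = W^(m) . W^(m)
        m *= 2                  # overshooting n - 1 is harmless
    return L

# Hypothetical 3-vertex graph: 0 -> 1 (3), 1 -> 2 (-1), 2 -> 0 (2).
W = [
    [0, 3, inf],
    [inf, 0, -1],
    [2, inf, 0],
]
D = all_pairs_by_squaring(W)
```

Each call to `extend_path` builds a fresh matrix, which mirrors the primed matrix L' in the pseudocode rather than updating L in place.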
12.2.3 Floyd-Warshall: $O(V^3)$
quintessential dynamic programming, bottom up
Observation: wasteful to just consider expanding the distance by one edge segment at a time.
instead of a single edge weight w, why don't we use the path distance d from the previous iteration?
                     first part (path)    grow by
REPEAT SQUARING      i ⇝ k                k → j (edge)
FLOYD-WARSHALL       i ⇝ k                k ⇝ j (path)
$d^{(k)}_{ij} = \begin{cases} w_{ij} & \text{if } k = 0 \\ \min\big( d^{(k-1)}_{ij},\ d^{(k-1)}_{ik} + d^{(k-1)}_{kj} \big) & \text{if } k \ge 1 \end{cases}$
12.2.4 Algorithm - very simple!
FLOYDWARSHALL(W)
  D^(0) ← W
  for k ← 1 to n
    for i ← 1 to n
      for j ← 1 to n
        d^(k)_ij ← min(d^(k−1)_ij, d^(k−1)_ik + d^(k−1)_kj)
  return D^(n)
as a side effect, record predecessor when taking min
in an implementation, can just reuse the same space!!!
(it won't hurt to use $d^{(k)}_{ij}$ when we need $d^{(k-1)}_{ij}$.)
$O(n^3)$ time, $O(n^2)$ space
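The triple loop might look like this in Python. It is a sketch under assumptions: the function name, the in-place reuse of one distance matrix, the predecessor-matrix convention (`pred[i][j]` is the vertex before j on the best i-to-j path found so far), and the three-vertex sample matrix are illustrative choices.

```python
from math import inf

def floyd_warshall(W):
    """W is an n x n matrix with 0 on the diagonal and inf for missing edges."""
    n = len(W)
    d = [row[:] for row in W]                      # D(0) = W, then reused in place
    pred = [[i if i != j and W[i][j] != inf else None for j in range(n)]
            for i in range(n)]
    for k in range(n):                             # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    pred[i][j] = pred[k][j]        # record predecessor when taking min
    return d, pred

# Hypothetical 3-vertex graph: 0 -> 1 (3), 1 -> 2 (-1), 2 -> 0 (2).
W = [
    [0, 3, inf],
    [inf, 0, -1],
    [2, inf, 0],
]
dist, pred = floyd_warshall(W)
```

Only one $n \times n$ distance matrix is kept, which is where the $O(n^2)$ space bound comes from.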
12.3 Transitive closure: $E^*$ (Warshall's)
Boolean yes/no question: is there a path from i to j?
very useful for transitive relations:
(e.g., a > b, b > c, then a > c)
(e.g., $a \le b$, $b \le c$, then $a \le c$)
Question becomes "is there a directed path?"
finite path length ⇒ yes; infinite ⇒ no.
can just replace min with OR, + with AND.
yes → TRUE (or 1), no → FALSE (or 0)
FLOYDWARSHALL(W)
  D^(0) ← W
  for k ← 1 to n
    for i ← 1 to n
      for j ← 1 to n
        d_ij ← min(d_ij, d_ik + d_kj)
  return D^(n)
TRANSITIVECLOSURE(G)
  T^(0) ← Adj[G]
  for k ← 1 to n
    for i ← 1 to n
      for j ← 1 to n
        t_ij ← t_ij ∨ (t_ik ∧ t_kj)
  return T^(n)
$t^{(0)}_{ij}$ = true if $(i, j) \in E$, false otherwise; $t^{(0)}_{ii}$ = true
$t^{(k)}_{ij} \leftarrow t^{(k-1)}_{ij} \lor (t^{(k-1)}_{ik} \land t^{(k-1)}_{kj})$ means:
if (i, j) is not already in $E^*$, and we have both (i, k) and (k, j) in $E^*$, then add (i, j) to $E^*$.
Summary: Floyd-Warshall uses an adjacency matrix, better for dense graphs;
but for sparse graphs, want the E factor closer to V.
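A short Python sketch of the Boolean version follows, where OR plays the role of min and AND plays the role of +. The function name, the vertex numbering from 0, and the sample edges are assumptions for illustration.

```python
def transitive_closure(n, edges):
    """Vertices are 0..n-1; edges is a list of directed (i, j) pairs."""
    t = [[i == j for j in range(n)] for i in range(n)]   # t_ii = true
    for i, j in edges:
        t[i][j] = True                                   # t_ij = true iff (i, j) in E
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # min becomes OR, + becomes AND
                t[i][j] = t[i][j] or (t[i][k] and t[k][j])
    return t

# Hypothetical graph: 0 -> 1 -> 2, with vertex 3 isolated.
reach = transitive_closure(4, [(0, 1), (1, 2)])
```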
12.4 Sparse graphs: Johnson's algorithm
Idea: run DIJKSTRA's if no negative weight (preprocessing pass to check)
But if there are, do reweighing: add an offset to shift the weights up to nonnegative.
Can't just add a constant weight to each edge.
why not?
because path length is a separate concept from path count (number of edges).
could screw up path length by (difference in edge count) × (edge-weight offset to make everybody positive).
[Figure: two copies of a graph on u, v, w, x; in the second, all weights are shifted up by 2 to eliminate negative weights. True shortest path: $u \to v \to w \to x$, weight 2. Shortest path after the shift (incorrect): $u \to x$, weight 5.]
idea:
add a new source vertex s with an edge to every other vertex in G
weight = 0 on all these new edges.
Run BELLMAN-FORD from s to compute d[v] for all v in G.
use d[v] from BELLMAN-FORD as the new offset function h(v).
Reweigh each edge (u, v) as $\hat{w}(u, v) = w(u, v) + h(u) - h(v)$
for each u, run DIJKSTRA's on this reweighed graph
shift weight back: $\delta(u, v) = \hat{\delta}(u, v) + h(v) - h(u)$.
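The steps of Johnson's algorithm can be sketched in Python as below. This is an illustration under assumptions: the function name, the edge-list input, and the sample graph are hypothetical, and the new source vertex is simulated by starting every h[v] at 0, which is the state Bellman-Ford reaches right after relaxing the 0-weight edges out of s.

```python
import heapq
from math import inf

def johnson(vertices, edges):
    """edges: list of (u, v, w). Returns dist[u][v]; raises ValueError on a negative cycle."""
    # Bellman-Ford from the virtual source: h[v] = delta(s, v).
    h = {v: 0 for v in vertices}
    for _ in range(len(vertices)):
        changed = False
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if not changed:
            break
    else:
        raise ValueError("negative cycle")     # still changing after |V| passes

    # Reweigh: w_hat(u, v) = w(u, v) + h(u) - h(v) >= 0.
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))

    dist = {}
    for s in vertices:                         # Dijkstra's from every vertex
        d = {v: inf for v in vertices}
        d[s] = 0
        heap = [(0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue                       # stale entry
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        # Shift back: delta(s, v) = delta_hat(s, v) + h(v) - h(s).
        dist[s] = {v: d[v] + h[v] - h[s] for v in vertices}
    return dist

# Hypothetical 3-vertex graph with one negative edge.
dist = johnson([0, 1, 2], [(0, 1, 3), (1, 2, -1), (2, 0, 2)])
```

Unreachable pairs stay at `inf`, since adding finite offsets to `inf` leaves it unchanged.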
12.4.1 Proof reweighing works:
need to show $w(p) = \delta(v_0, v_k) \iff \hat{w}(p) = \hat{\delta}(v_0, v_k)$.
basically along any path, all intermediate h terms cancel (telescoping series):
$\hat{w}(v_1, v_2) + \hat{w}(v_2, v_3) + \ldots$
$= w(v_1, v_2) + h(v_1) - h(v_2) + w(v_2, v_3) + h(v_2) - h(v_3) + \ldots$
$= w(v_1 \ldots v_n) + h(v_1) - h(v_n)$.
Lecture 13
Flow Networks
This lecture
Flow Networks
FORD-FULKERSON method
EDMONDS-KARP algorithm
13.1 Flow network
directed graph G = (V, E)
source s, sink t
capacity on each edge, $c(u, v) \ge 0$.
[Figure: flow network on vertices s, u, v, w, x, t with edge capacities 16, 13, 10, 4, 12, 14, 9, 7, 4, 20.]
two views: net flow vs. positive flow
Book uses net flow $f : V \times V \to \mathbb{R}$
positive flow is $p : V \times V \to \mathbb{R}$
$p(u, v) = \begin{cases} f(u, v) & \text{if } f(u, v) > 0 \\ 0 & \text{if } f(u, v) \le 0 \end{cases}$
Basically $f(u, v) = p(u, v) - p(v, u)$; $f(u, v) = -f(v, u)$
Example:
p(w, x) = 11; p(x, w) = 0 because net flow is 11.
f(w, x) = 11; f(x, w) = −11 = p(x, w) − p(w, x) = 0 − 11.
[Figure: the same network with flow/capacity labels 11/16, 8/13, 0/10, 1/4, 12/12, 11/14, 4/9, 7/7, 4/4, 15/20.]
Three properties:
capacity constraint: $0 \le p(u, v) \le c(u, v)$ $\forall u, v \in V$
skew symmetry: $f(u, v) = -f(v, u)$
flow conservation: $\forall u \in V - \{s, t\}$: $\sum_{v \in V} p(u, v) = \sum_{v \in V} p(v, u)$, or equivalently $\sum_{v \in V} f(u, v) = 0$
e.g., u: p(u, v) + p(u, w) = p(s, u) + p(w, u)?
12 + 0 = 11 + 1: yes.
e.g., w: p(w, u) + p(w, x) = p(s, w) + p(u, w) + p(v, w)?
1 + 11 = 8 + 0 + 4: yes.
This means Inflow(u) = Outflow(u) for all vertices except source and sink. Kirchhoff's current law.
Defn: value of flow: value(f) = Outflow(s) − Inflow(s) = Inflow(t) − Outflow(t)
Goal: maximize value(f). Could be less.
Example: $s \xrightarrow{5} x \xrightarrow{4} t$
max out at flow of 4 (min of capacities along any segment)
(x, t) is the bottleneck
Example:
[Figure: flow network on s, u, v, w, x, y, z, t with flow/capacity labels; this flow has value |f| = 13.]
Can do better:
[Figure: the same network with an augmenting path highlighted.]
Note: we're actually reducing f(u, x) in order to increase overall flow
Residual network $G_f$
include only edges w/ nonzero residual capacities.
$c_f(u, v) = c(u, v) - f(u, v)$ for a forward edge,
$c_f(v, u) = f(u, v)$ for the added back edge.
example: G has $u \xrightarrow{3/5} v$ and $v \xrightarrow{0/1} u$; then
in $G_f$, $u \to v$ has $c_f(u, v) = 5 - 3 = 2$ (after subtracting flow 3 from capacity 5, we have 2 capacity left),
and $v \to u$ has $c_f(v, u) = 1 + 3 = 4$ (its own capacity 1, minus the net flow of −3 in its direction).
Augmenting path p:
a path p from s to t in $G_f$; flow can be increased by $c_f(p) = \min \{ c_f(u, v) : (u, v) \text{ on } p \}$.
$|E_f| \le 2|E|$.
[Figure: the flow network G from above; its residual network $G_f$; an augmenting path $s \to w \to v \to t$ in the residual network with residual capacity 4; and the flow after augmenting, where 8/13 becomes 12/13, 4/9 becomes 0/9 and 15/20 becomes 19/20.]
Flow improvement:
original graph G has flow f,
residual graph $G_f$ has flow $f'$
Improved flow: $|f + f'| = |f| + |f'|$ (i.e., just add the residual graph's flow)
Max-flow, min cut: (S, T)
Defn: a cut of a flow network is a partition of V into (S, T) such that $s \in S$, $t \in T$
(or, look at it as a set of edges that, when removed, disconnects s from t).
[Figure: the flow network with flow/capacity labels and a cut (S, T) drawn across it.]
Example: Green edges form a cut.
capacity of a cut (S, T): $c(S, T) = \sum_{u \in S, v \in T} c(u, v)$
Example: c(S, T) = 12 + 14 = 26 (not considering (v, w), which goes from T to S!! don't add its 9)
flow across the cut (S, T) is f(S, T), which is the same as the whole network's flow |f|!!!
Example: f(S, T) = 12 + (−4) + 11 = 19
same as Outflow(s), Inflow(t)
Theorem (the following are equivalent):
1. f is a maximum flow
2. the residual network $G_f$ contains no augmenting paths
3. |f| = c(S, T) for some cut (S, T).
Proof:
(1) f max flow ⇒ (2) no aug. paths
trivial: because if there exists an aug. path, we can increase the flow by $f'$ (which means it wasn't max)
(2) no aug. paths ⇒ (3) |f| = c(S, T)
no augmenting path means the residual graph has no path from s to t.
Let $S = \{s\} \cup \{v : \exists \text{ path } s \leadsto v \text{ in } G_f\}$, $T = V - S$.
$f(u, v) = c(u, v)$ for $u \in S$, $v \in T$;
otherwise $c_f(u, v) = c(u, v) - f(u, v) > 0$, and v should have been in S in the first place.
$\Rightarrow |f| = f(S, T) = c(S, T)$
(3) |f| = c(S, T) ⇒ (1) f is max flow
because $|f| = f(S, T) \le c(S, T)$ over all cuts (S, T),
and "=" is as large as |f| gets, therefore max.
13.1.1 Maxflow methods:
Ford-Fulkerson method: find augmenting path.
Edmonds-Karp algorithm (uses BFS)
13.2 FORD-FULKERSON
f[u, v] ← 0 for all edges (u, v)
while there exists a path p from s ⇝ t in G_f (residual network):
  augment f by c_f(p): namely, for each edge (u, v) on p
    f[u, v] ← f[u, v] + c_f(p)
    f[v, u] ← −f[u, v]
Note:
backward edges get introduced,
forward edges drop out after they reach capacity.
Runtime:
$O(E \, |f^*|)$, where $f^*$ is the max flow, assuming integer capacities! (value dependent!!)
because O(E) to find a path, and an augmentation may add as little as 1 unit of flow
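[Figure: network on s, u, v, t with four edges of capacity 1,000,000 and one middle edge of capacity 1. Augmenting through the capacity-1 edge and then through its back edge adds only 1 unit per iteration, leaving residual capacities 999,999, and so on.]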
13.3 EDMONDS-KARP
Key idea:
use BFS on residual graph
use the first $s \leadsto t$ path found (i.e., shortest, fewest segments)
[Figure: Edmonds-Karp on the example network, alternating flow network and residual network. Iteration 1 augments along a path of bottleneck 12 (12/16, 12/12, 12/20); iteration 2 adds 4 (4/13, 4/14, 4/4); iteration 3 adds 7 (11/13, 11/14, 7/7, 19/20). After that there are no more BFS paths from s to t.]
Complexity:
$O(VE^2)$ = $O(E)$ per BFS × $O(VE)$ augmentations.
An edge (u, v) in an $s \leadsto t$ augmenting path p is critical if $c_f(u, v) = c_f(p)$.
after augmentation, (u, v) no longer appears in the next residual graph!! (because it's saturated)
Since $\delta_f(s, v)$ is always $\le |V|$ and grows by at least 2 between two times (u, v) is critical,
(u, v) is critical at most |V|/2 times.
total number of (u, v)'s considered $\le 2|E|$
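A compact Python sketch of Edmonds-Karp is shown below. It is illustrative only: the function name, the dictionary of capacities keyed by `(u, v)`, and the way residual capacities are stored are assumptions, and the sample network is assumed to be the six-vertex example from this lecture.

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """capacity maps (u, v) to c(u, v). Returns the value of a maximum flow."""
    residual = dict(capacity)                 # residual capacities c_f
    adj = {}
    for u, v in capacity:
        residual.setdefault((v, u), 0)        # back edge starts at 0 unless it exists in G
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    while True:
        # BFS for a shortest s ~> t path in the residual network.
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in pred and residual[(u, v)] > 0:
                    pred[v] = u
                    queue.append(v)
        if t not in pred:
            return total                      # no augmenting path left
        path, v = [], t
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        bottleneck = min(residual[e] for e in path)   # c_f(p)
        for u, v in path:
            residual[(u, v)] -= bottleneck    # forward edge loses capacity
            residual[(v, u)] += bottleneck    # back edge gains it
        total += bottleneck

# Assumed capacities for the example network on s, u, v, w, x, t.
network = {
    ("s", "u"): 16, ("s", "w"): 13, ("u", "w"): 10, ("w", "u"): 4,
    ("u", "v"): 12, ("v", "w"): 9, ("w", "x"): 14, ("x", "v"): 7,
    ("v", "t"): 20, ("x", "t"): 4,
}
max_flow = edmonds_karp(network, "s", "t")

# The million-capacity example: BFS never routes through the capacity-1 edge.
big = {("s", "u"): 10**6, ("s", "v"): 10**6, ("u", "v"): 1,
       ("u", "t"): 10**6, ("v", "t"): 10**6}
big_flow = edmonds_karp(big, "s", "t")
```

Because BFS always finds a fewest-edge path, the second network is solved in two augmentations instead of two million.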
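13.4 Maximal bipartite matching.
undirected bipartite graph G = (V, E), $V = L \cup R$, $E \subseteq L \times R$
a matching M in G is a set of edges $M \subseteq E$ such that no two edges in M share an endpoint
M is a maximal matching iff for every $e \in E - M$, $M \cup \{e\}$ is not a matching
[Figure: bipartite graph with vertex sets L and R.]
perfect matching iff M touches every vertex:
if G is bipartite then |M| = |L| = |R|
claim: maximum bipartite matching is a special case of network flow.
Add a source s and a sink t, with unit-capacity edges from s to every vertex in L and from every vertex in R to t; direct the original edges from L to R; do Maxflow.
[Figure: the bipartite graph with added source s and sink t.]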
Lecture 14
NP-Completeness
Problem complexity (not algorithm complexity)
Decision problems: outputs Yes/No
Verification problems: check a solution instance for Yes/No (guess and check)
certificate = the test case that will show the sufficient condition.
complexity class P (decision in polynomial time),
NP (guess-and-verification in polynomial time)
NP, NP-complete, NP-hard
Reduction: different problems using the same algorithm!
14.1 Other Graph problems
EULER Tour               HAMILTONIAN Cycle        TRAVELING SALESMAN
visits each edge         visits each vertex       visits each vertex once
exactly once             exactly once             w/ least cost
O(E)                     hard                     hard

k-CLIQUE                 k-VERTEX COVER
a subgraph that is       subset V' ⊆ V
a complete graph         that touches all edges,
w/ k vertices            |V'| = k
hard                     hard
14.1.1 Euler Tour
[Figure: graph G_1 with an Euler tour numbered 1 to 9 from the start vertex (visits each edge once, doesn't have to close the cycle; can visit a vertex multiple times). Graph G_2: can you find an Euler tour in this graph??]
Easy to decide! O(E) time (page 559, problem 22-3)
in-degree(v) = out-degree(v) $\forall v \in V$.
Corollary (undirected case): either all $v \in V$ have even degree or exactly two vertices have odd degree.
14.1.2 Hamiltonian cycles
simple cycle that visits each vertex exactly once.
Cycle must end on the same vertex as it started.
[Figure: Graph G_1: can you find a Hamiltonian cycle? Graph G_2 has a Hamiltonian cycle as shown.]
Can't visit a vertex more than once!!
No known efficient (polynomial-time) algorithm
14.1.3 Traveling Salesman
input: complete graph, weighted
output: lowest-cost cycle
example: $u \xrightarrow{1} w \xrightarrow{2} v \xrightarrow{1} x \xrightarrow{3} u$
[Figure: complete graph on u, v, w, x with edge weights 3, 5, 2, 4, 1, 1.]
Easy to find a cycle in a clique, but hard to find the minimum-weight cycle
But if we don't need the minimum, we can find good cycles in polynomial time (approximation algorithm)
14.1.4 Constructive vs. Decision vs. Verification Problems
Constructive: produce the solution (hardest)
e.g., show me a hamiltonian cycle in G
Decision: Yes/No, Can/Cannot (easier)
e.g., does G have a hamiltonian cycle?
Verification: Yes/No: is a given solution valid? (easiest)
e.g., is path p a valid hamiltonian cycle for G?
certificate = the alleged solution to be verified
e.g., the path p to be verified as a hamiltonian cycle
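The verification problem is the easy one, and a checker fits in a few lines of Python. This is a sketch under assumptions: the function name, the dictionary-of-adjacency encoding of G, the convention that the certificate lists each vertex once without repeating the start, and the square sample graph are all illustrative.

```python
def is_hamiltonian_cycle(adj, cycle):
    """adj maps each vertex to its neighbours; cycle is the certificate to verify."""
    n = len(adj)
    # Every vertex must appear exactly once.
    if len(cycle) != n or set(cycle) != set(adj):
        return False
    # Consecutive vertices (wrapping around to the start) must be joined by an edge.
    return all(cycle[(i + 1) % n] in adj[cycle[i]] for i in range(n))

# Hypothetical graph: a square a-b-c-d-a.
square = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "a"],
}
ok = is_hamiltonian_cycle(square, ["a", "b", "c", "d"])
bad = is_hamiltonian_cycle(square, ["a", "c", "b", "d"])
```

The check runs in time polynomial in the size of the graph, which is all that membership in NP asks for; it says nothing about how hard it is to find such a cycle.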
14.2 Problem complexity
P: set of problems solvable in polynomial time, $O(n^k)$
example: sorting, single-source shortest path
example: verify if p is a hamiltonian cycle in G.
NP: set of decision problems for which a certificate (guessed solution) can be verified in polynomial time.
NP does not mean "not polynomial"!!
N means nondeterministic (as in nondeterministic finite automaton)
must be a DECISION PROBLEM!!!
a decision problem is NP-complete
⇒ the corresponding constructive problem is hard
⇒ polynomial time solution is unlikely to exist
14.2.1 Formal Languages
alphabet $\Sigma$ = finite set of symbols
language L over $\Sigma$ = set of strings whose characters are $\in \Sigma$
example: $\Sigma = \{0, 1\}$, $L = \{10, 11, 101, 111, 1011, \ldots\}$
$\varepsilon$: empty string
$\Sigma^*$ = set of all strings over $\Sigma$
example: $\Sigma^* = \{\varepsilon, 0, 1, 00, 01, 10, 11, 000, \ldots\}$
Operators on Languages
union: $L_1 \cup L_2$
intersection: $L_1 \cap L_2$
complement: $\overline{L} = \Sigma^* - L$
example: $L = \{10, 11, 101, 111, 1011\}$
$\overline{L} = \{\varepsilon, 0, 1, 00, 01, 000, 001, 100, \ldots\}$
concatenation: $L_1 L_2 = \{x_1 x_2 : x_i \in L_i\}$
example: $L_1 = \{10, 11\}$, $L_2 = \{00, 01\}$
$L_1 L_2 = \{1000, 1001, 1100, 1101\}$
self-concatenation: $L^k = LLL\ldots$ for a total of k times
$L^*$ = closure of L = $\{\varepsilon\} \cup L^1 \cup L^2 \cup L^3 \cup \ldots$
14.2.2 Decision problem in Formal language
$x \in \Sigma^*$ = instance of input to decision problem
i.e., x is a concrete representation (string encoding) for the certificate.
Q(x): a decision function:
$Q(x) = \begin{cases} 1 & \text{if x is a valid certificate} \\ 0 & \text{if x is invalid (rejected)} \end{cases}$
$L = \{x \in \Sigma^* : Q(x) = 1\}$
i.e., L is the set of valid certificates to a decision problem Q.
Example
$\Sigma$ = ASCII character set
$\Sigma^*$ = all possible ASCII strings;
$L_P \subset \Sigma^*$: valid Python programs
$L_G = \langle G \rangle \subset L_P$: valid encodings for graphs (in dictionary-of-adjacency syntax)
e.g., $L_G$ = { {a: [b, c], b: [a, c]}, {a: [c], b: [a, d]}, ... }
$x \in \Sigma^*$ = string; can restrict x to a $\langle G \rangle$ (graph syntax)
Q(x) returns 1 if x has a hamiltonian cycle, 0 otherwise
Q can be viewed as a language L over $\Sigma$ whose members are (encodings for) those graphs with hamiltonian cycles.
Note: since ASCII (and any character set) can be encoded in binary, we can always use $\Sigma = \{0, 1\}$.
14.2.3 Decision in polynomial time
For an algorithm A,
A accepts string x if A(x) outputs 1 (e.g., x as a graph)
A rejects string x if A(x) outputs 0
A decides language L if
for every $x \in L$, A(x) terminates with 1
for every $x \notin L$, A(x) terminates with 0
(e.g., L as a set of graphs with hamiltonian cycles)
A accepts language L in polynomial time if $A(x) = 1$ $\forall x \in L$, $|x| = n$, and A terminates in $O(n^k)$ time for constant k
But A doesn't have to worry about rejecting strings $\notin L$
A might not terminate on a string $x \notin L$!!
A decides language L in polynomial time if A(x) always halts with 1 if $x \in L$, and 0 if $x \notin L$, both in $O(n^k)$ time (where n = |x|, string length).
14.2.4 P, NP, co-NP in terms of formal languages
P = $\{L : \exists$ algorithm A that decides L in polynomial time$\}$
= set of languages (inputs) decidable in polynomial time
= set of problems with polynomial-time solutions
NP = $\{L : \exists$ polynomial-time algorithm A : $\forall x \in L$, $\exists$ certificate y, $|y| = O(|x|^c)$ : $A(x, y) = 1\}$
= set of languages that are verifiable in polynomial time
= set of problems whose inputs x are verifiable with certificate y by verification algorithm A in polynomial time.
P $\subseteq$ NP.
co-NP: $\overline{L} \in$ NP.
Open question: is $L \in$ NP $\Rightarrow \overline{L} \in$ NP?
P $\subseteq$ NP $\cap$ co-NP.
However, unsure whether P = NP $\cap$ co-NP
14.3 Reduction
Problem A reduces to problem B:
want to solve A(x)
use a preprocessing function f(x) on the input
solve A(x) by calling algorithm B(f(x)) as a subroutine
⇒ A is no more difficult than B; B is at least as general as A.
Example from last time: MAXIMAL BIPARTITE MATCHING reduces to MAXFLOW
Preprocessing the input G: add a source, add a sink, add unit-capacity edges from the source and to the sink to form $G'$
Run MAXFLOW($G'$)
[Figure: the algorithm for Max Bipartite Matching drawn as a box: input → preprocessing (add s, t and the new edges) → algorithm for Max Flow → postprocessing → output.]
Other examples: PAIRWISE CONJUNCTIVE MAX-DIFFERENCE reduces to BELLMAN-FORD
$L_1 \le_p L_2$: $L_1$ is polynomial-time reducible to $L_2$ if $\exists$ a polynomial-time function f such that $x \in L_1$ iff $f(x) \in L_2$.
for reduction of decision problems, trivial post-processing (0/1 output)
14.4 NP, NP-complete, NP-hard
Memorize this now: NP ⊇ NP-Complete ⊆ NP-Hard
NP: (review)
set of decision problems: guess and verify in polynomial time
example: does G have a hamiltonian cycle?
NP-Hard:
$L \in$ NPH: $\forall L' \in$ NP, $L' \le_p L$
i.e., an algorithm for an NP-hard problem can be called to solve any NP problem, after the input is preprocessed in polynomial time.
L itself might actually run in exponential time!!!! (e.g., the constructive hamiltonian problem)
L could also be in NP, but need not be in NP.
NP-Complete = NP $\cap$ NPH
1. $L \in$ NP
2. L is NP-Hard
it turns out the hamiltonian-decision problem is NP-hard (no need to use the constructive hamiltonian)
since hamiltonian decision is in NP and in NPH ⇒ NPC
14.4.1 How to show $L \in$ NP-Complete?
1) first show $L \in$ NP
easy: show ability to verify a certificate in polynomial time
2a) show that every $L' \in$ NP reduces to L in polynomial time.
(recall: NP ⊇ NPC ⊆ NPH)
hard! How to cover all of NP? (could be many)
2b) Alternative: if there is a known $L' \in$ NPC, show $L' \le_p L$
since NP $\le_p L'$ ($\in$ NPC) $\le_p L$, it follows that $L \in$ NPH
So... we need to
Figure out the first $L \in$ NPC by showing (1), (2a): hard. (CIRCUIT-SAT problem)
For all other L we want to show NPC, do (1), (2b): easier. (Circuit Sat $\le_p$ L)
if any NP-complete problem is in P, then P = NP.
14.4.2 Examples: Circuit Satisfiability
input: combinational logic (p. 989)
Q: is there an assignment of 0, 1 to inputs so it outputs 1?
[Figure: combinational circuit of OR, NOT and AND gates with inputs x1, x2, x3 and a satisfying assignment marked on the wires.]
To show CIRCUIT-SAT is NP-complete:
1) Show it's NP:
easy: someone gives us circuit and answer, we verify. (actually could be linear)
2a) Show it's NP-hard:
assume an algorithm A for any problem in NP.
Construct a new combinational circuit C by unrolling!!!!
[Figure: a machine that contains verification algorithm A, decomposed into a combinational circuit M (input, current state → output, next state) and a latch. The new circuit C is obtained by inlining a polynomial number of copies of M, each copy's next state feeding the following copy's current state, from the init state to a final 0/1 output.]
call CIRCUIT-SAT(C) to effectively execute A:
returns 1 iff A(x) = 1; returns 0 iff A(x) = 0.
14.4.3 NP completeness proofs
No need to show $L' \le_p L$ for EVERY $L' \in$ NP.
only need to show a polynomial-time reduction from one $L'$ that is known to be NP-complete.
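14.5 NPC Proof for (Formula) Satisfiability problem.
allow $\land$ (AND), $\lor$ (OR), $\lnot$ (NOT),
$\to$ (implies): $a \to b$ means $\lnot a \lor b$,
$\leftrightarrow$ (if and only if): $a \leftrightarrow b$ means $(a \land b) \lor (\lnot a \land \lnot b)$
easy to show equivalent to the combinational logic problem.
Example: $\phi = ((x_1 \to x_2) \lor \lnot((\lnot x_1 \leftrightarrow x_3) \lor x_4)) \land \lnot x_2$
Is $\phi$ a satisfiable formula? ⇒ the SAT problem
SAT is NP-complete.
Show (1) SAT $\in$ NP and (2b) an existing NPC problem $\le_p$ SAT.
(1) easy: guess assignment, verify:
example: certificate for $\phi$ above is $\langle x_1, x_2, x_3, x_4 \rangle = \langle 0, 0, 1, 1 \rangle$
verify: $((0 \to 0) \lor \lnot((\lnot 0 \leftrightarrow 1) \lor 1)) \land \lnot 0 = (1 \lor \ldots) \land 1 = 1$
(2b) CIRCUIT-SAT is known to be NP-complete; let's use it to show CIRCUIT-SAT $\le_p$ SAT.
To solve CIRCUIT-SAT using SAT:
Note: $\leftrightarrow$ is equality in binary.
AND-gate: (output $\leftrightarrow$ (input1 $\land$ input2))
OR-gate: (output $\leftrightarrow$ (input1 $\lor$ input2))
NOT-gate: (output $\leftrightarrow \lnot$ input)
Example:
[Figure: circuit of OR, NOT and AND gates with inputs x1, x2, x3, internal wires x4 to x9 and output x10.]
$\phi = x_{10} \land (x_4 \leftrightarrow \lnot x_3)$
$\land\ (x_5 \leftrightarrow (x_1 \lor x_2))$
$\land\ (x_6 \leftrightarrow \lnot x_4)$
$\land\ (x_7 \leftrightarrow (x_1 \land x_2 \land x_4))$
$\land\ (x_8 \leftrightarrow (x_5 \lor x_6))$
$\land\ (x_9 \leftrightarrow (x_6 \lor x_7))$
$\land\ (x_{10} \leftrightarrow (x_7 \land x_8 \land x_9))$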
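14.5.1 3-CNF satisfiability (3-SAT)
Motivation: a restricted form of formula:
equally expressive as general SAT
polynomial time translation to/from general SAT formula
polynomial time transformation from circuits
want to show CIRCUIT-SAT $\le_p$ 3-SAT $\le_p$ SAT
by transitivity, CIRCUIT-SAT $\le_p$ SAT
3-CNF:
CNF (restricted form) = Conjunctive Normal Form = AND of clauses
3-CNF: each clause is an OR of 3 different literals
each literal is either a variable or its complement (NOT) form.
example: $(a \lor b \lor c) \land (a \lor d \lor e) \land (c \lor e \lor f)$
bag of tricks: boolean identities:
AND: $\phi = o \leftrightarrow (a \land b)$
a  b  o  |  φ = o ↔ (a ∧ b)
0  0  0  |  1
0  1  0  |  1
1  0  0  |  1
1  1  0  |  0
0  0  1  |  0
0  1  1  |  0
1  0  1  |  0
1  1  1  |  1
So, reading off the rows where φ = 0,
$\lnot\phi = (a \land b \land \lnot o) \lor (\lnot a \land \lnot b \land o) \lor (\lnot a \land b \land o) \lor (a \land \lnot b \land o)$
Use DeMorgan's laws:
$\phi = (\lnot a \lor \lnot b \lor o) \land (a \lor b \lor \lnot o) \land (a \lor \lnot b \lor \lnot o) \land (\lnot a \lor b \lor \lnot o)$
can construct OR gates similarly
inserting identities if needed to fit 3-CNF:
e.g., $(a \lor b) = (a \lor b \lor p) \land (a \lor b \lor \lnot p)$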
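The two identities can be checked by brute force over all truth assignments. The Python sketch below is only a sanity check; the function names `gate`, `cnf`, `two_literal` and `padded` are made up for this illustration, and the four clauses are the ones obtained from the truth table of the AND gate by De Morgan's laws.

```python
from itertools import product

def gate(a, b, o):
    """The AND-gate condition: o <-> (a AND b)."""
    return o == (a and b)

def cnf(a, b, o):
    """3-CNF form: one clause per falsifying row of the truth table."""
    return ((not a or not b or o) and
            (a or b or not o) and
            (a or not b or not o) and
            (not a or b or not o))

def two_literal(a, b):
    return a or b

def padded(a, b, p):
    """(a OR b) padded to two 3-literal clauses with a dummy variable p."""
    return (a or b or p) and (a or b or not p)

bools = [False, True]
gate_matches = all(gate(a, b, o) == cnf(a, b, o)
                   for a, b, o in product(bools, repeat=3))
padding_matches = all(two_literal(a, b) == padded(a, b, p)
                      for a, b, p in product(bools, repeat=3))
```

Each clause rules out exactly one row of the truth table, which is why four falsifying rows give four clauses.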
14.6 NPC proof cont'd: k-clique problem (CLIQUE)
input: undirected graph G
definition: clique = subset of V that forms a complete subgraph
k-clique: complete graph with k vertices
problem: does a clique of size k exist in G?
It is NP-complete:
1) NP: obvious: guess and verify.
2b) NP-hard: show reduction 3-CNF-SAT $\le_p$ CLIQUE
for each clause $C_i = (a \lor b \lor c)$, make 3 vertices $a_i, b_i, c_i$ for group $C_i$
link each literal to all literals in other clauses except its complement:
add edge $(a_i, x_j)$ if $i \ne j$ and $x \ne \lnot a$
example: $\phi = C_1 \land C_2 \land C_3$, each clause an OR of three literals over $x_1, x_2, x_3$
[Figure: one group of three vertices per clause C_1, C_2, C_3; the edges are drawn in three panels: edges for x_1, edges for x_2, edges for x_3.]
if we can find a k-clique for k clauses, then the clique represents a certificate (set each literal in the clique to true).
[Figure: a 3-clique picked out of the graph.]
in the example, there are several solutions!
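The construction can be sketched in Python and cross-checked against brute-force satisfiability on tiny inputs. Everything here is an assumption for illustration: the function names, the integer encoding of literals (3 for $x_3$, −3 for its complement), and the sample formula, which is a hypothetical three-clause instance rather than the one in the figure.

```python
from itertools import combinations, product

def formula_to_graph(clauses):
    """One vertex (i, literal) per literal occurrence in clause i.
    Join two vertices iff they sit in different clauses and are not complements."""
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = {frozenset((a, b)) for a, b in combinations(vertices, 2)
             if a[0] != b[0] and a[1] != -b[1]}
    return vertices, edges

def has_clique(vertices, edges, k):
    """Exponential-time check, fine for tiny examples only."""
    return any(all(frozenset(pair) in edges for pair in combinations(group, 2))
               for group in combinations(vertices, k))

def satisfiable(clauses):
    """Brute force over all assignments, used only to cross-check the reduction."""
    names = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(names)):
        val = dict(zip(names, values))
        if all(any(val[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            return True
    return False

# Hypothetical formula: (x1 or not x2 or not x3) and (not x1 or x2 or x3) and (x1 or x2 or x3).
phi = [[1, -2, -3], [-1, 2, 3], [1, 2, 3]]
vertices, edges = formula_to_graph(phi)
clique_found = has_clique(vertices, edges, len(phi))
```

The reduction itself (`formula_to_graph`) is polynomial; only the two checkers are exponential, and they are there just to confirm that the formula is satisfiable exactly when the graph has a clique with one vertex per clause.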
Lecture 15
Approx. Algorithms
This lecture
Approximation algorithms
ratio bound: bounding the cost of a suboptimal solution within some factor of the ideal
15.1 k-VERTEX-COVER problem
input: undirected graph G(V, E)
definition: vertex cover = subset of V that contains at least one endpoint of every edge.
k-vertex cover: a vertex cover with k vertices
NP-complete:
NP: easy: for every edge (u, v), test whether u or v is in the vertex cover.
NP-hard: reduce k-CLIQUE $\le_p$ k-VERTEX-COVER
15.1.1 Observation:
Graph G has a k-CLIQUE iff the complement of G has a (|V| − k) vertex cover.
$\overline{G} = (V, \overline{E})$: complement of G = (V, E),
i.e., $\overline{G}$ has the same vertices as G, but only those edges that are not in G.
Claim: if $V'$ is a k-clique in G, then $V - V'$ is a vertex cover in $\overline{G}$.
[Figure: graph G with six vertices u, v, w, x, y, z and a clique of size 4, C = {u, v, x, y}; the complement graph with vertex cover {w, z} = V − C.]
Proof: the vertices of $V'$ are guaranteed not to have edges between each other in $\overline{G}$; so they are entirely unnecessary in the vertex cover of $\overline{G}$ (or else they'd be redundant).
since $V - V'$ contains all other vertices, all other edges will be covered.
15.1.2 Approximation algorithm for Vertex Cover
decision problem is NP-complete
finding an optimal vertex cover is NP-hard.
However: if we just want to find a reasonably good vertex cover, can solve it in polynomial time.
APPROX-VERTEX-COVER(G)
  C ← ∅
  E' ← E[G]
  while E' ≠ ∅
    do (u, v) ← EXTRACTRANDOM(E')
       C ← C ∪ {u, v}
       E' ← E' − {all edges incident on u or v}
  return C
This algorithm has a ratio bound of 2.
[Figure: APPROX-VERTEX-COVER on a graph with vertices a to g.
start: C = { }
pick edge (b, c): C = { b, c }; remove all edges incident to b or c
pick edge (e, f): C = { b, c, e, f }; remove all edges incident to e or f
pick edge (d, g): C = { b, c, e, f, d, g }; remove all edges incident to d or g
approximated vertex cover = { b, c, e, f, d, g }: twice as many as the optimal vertex cover = { b, d, e }]
First: C is a vertex cover by construction.
(an edge is removed only after it's covered by either u or v)
C always adds two vertices per iteration.
The picked edges share no endpoints, so any cover needs at least one vertex per picked edge: the best that an optimal solution can do is to use one instead of two vertices to cover those same edges.
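A minimal Python sketch of APPROX-VERTEX-COVER follows. It differs from the pseudocode in one labeled way: instead of extracting a random edge it always takes the first remaining one, so the run is deterministic. The function names and the edge list (assumed to match the seven-vertex example, whose optimal cover is {b, d, e}) are illustrative assumptions.

```python
def approx_vertex_cover(edges):
    """edges: list of (u, v) pairs of an undirected graph. Returns a set of vertices."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]            # any remaining edge works; the notes pick one at random
        cover |= {u, v}
        # Drop every edge incident on a vertex already in the cover.
        remaining = [(a, b) for a, b in remaining
                     if a not in cover and b not in cover]
    return cover

def is_vertex_cover(edges, cover):
    return all(u in cover or v in cover for u, v in edges)

# Assumed edges for the example graph on a..g.
example = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"),
           ("d", "e"), ("d", "f"), ("d", "g"), ("e", "f")]
cover = approx_vertex_cover(example)
```

Whatever edges get picked, they are pairwise disjoint, so the returned cover is never more than twice the size of an optimal one.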
15.2 Hamiltonian (HAM-CYCLE)
recall: Hamiltonian cycle = simple cycle that visits all vertices exactly once.
HAM-CYCLE (decision problem) is NP-complete.
proof:
1) NP: easy; walk the path
2b) VERTEX-COVER $\le_p$ HAM-CYCLE
use a 12-vertex widget for each edge
detail not important
Issue: No approximation of HAM-CYCLE
15.3 Traveling Salesman Problem (TSP)
TSP = $\{\langle G, c, k \rangle$ : G = (V, E) a complete graph,
$c : V \times V \to \mathbb{Z}$,
$k \in \mathbb{Z}$,
$\exists$ tour A with cost $c(A) \le k\}$
$c(A) = \sum_{(u,v) \in A} c(u, v)$
⇒ Hamiltonian cycle on a complete graph with cost $\le k$
easy to show NP-complete:
1) NP: test if cost $\le k$
2b) reduction: Hamiltonian $\le_p$ TSP
$c(i, j) = \begin{cases} 0 & \text{if there is an edge } (i, j) \\ 1 & \text{if } (i, j) \notin E \text{; assume no self loop } (v, v) \end{cases}$
then check for TSP with cost k = 0 (real edges only).
any tour using a non-edge will have cost > 0.
15.3.1 Approximated TSP
Near-optimal solution assuming triangle inequality:
simplifying assumption: going direct is never more expensive than going via an intermediate hop:
$c(u, w) \le c(u, v) + c(v, w)$
Approximation: with min-spanning tree.
Ratio bound of 2.
APPROX-TSP-TOUR(G, c)
  pick some vertex r ∈ V as root
  grow min-spanning tree T using MST-PRIM(G, c, r)
  return the pre-order tree walk of T as a hamiltonian cycle
  (i.e., visit root before visiting children recursively)
[Figure: example on vertices a to h: the minimum spanning tree rooted at a and the resulting tour; preorder: a, b, c, h, d, e, f, g. Approximated TSP, assuming triangle inequality.]
Proof:
First: a full walk around the min-spanning tree costs at most 2x optimal.
This is because the tree costs no more than the optimal tour (deleting one edge from a tour leaves a spanning tree), and the walk traverses every tree edge twice.
Second: the pre-order traversal jumps directly to the next unvisited vertex rather than backtracking through visited ones, so by the triangle inequality it can only do better.
There are better approximations. 2 is still not very good.
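APPROX-TSP-TOUR can be sketched in Python for points in the plane, where Euclidean distance satisfies the triangle inequality. The function names, the choice of vertex 0 as root, the $O(n^2)$ array-based Prim's, and the sample points are all assumptions for this illustration; `optimal_cost` is a brute-force helper included only so the ratio bound can be checked on a tiny instance.

```python
from itertools import permutations
from math import dist, inf

def approx_tsp_tour(points):
    """Returns a tour (list of point indices) from the preorder walk of an MST rooted at 0."""
    n = len(points)
    in_tree = [False] * n
    best = [inf] * n              # cheapest known edge into the tree
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[0] = 0
    for _ in range(n):            # Prim's on the complete graph
        u = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            w = dist(points[u], points[v])
            if not in_tree[v] and w < best[v]:
                best[v], parent[v] = w, u
    tour, stack = [], [0]         # preorder: visit a vertex before its children
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_cost(points, tour):
    return sum(dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def optimal_cost(points):
    """Brute force over all tours starting at vertex 0; tiny inputs only."""
    rest = range(1, len(points))
    return min(tour_cost(points, (0,) + p) for p in permutations(rest))

# Hypothetical points in the plane.
pts = [(0, 0), (0, 2), (3, 0), (3, 2), (1, 1), (5, 1)]
tour = approx_tsp_tour(pts)
```

On any instance that satisfies the triangle inequality, `tour_cost(pts, tour)` should come out at no more than twice `optimal_cost(pts)`.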
15.3.2 General TSP:
does not have a ratio bound () unless P = NP
Proof by contradiction: if such exists then can reduce
Hamiltonian cycle to TSP by cost assignment:
1. form complete graph E
/
from E such that
c(u, v) =
_
1 if (u, v) E,
[V[ +1 otherwise
b
a
c
d
e
b
a
c
d
e
HAM-CYCLE?
cost = 1 (original)
cost = |V| + 1
Solve as an
instance of TSP
w/ ratio bound =
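2. optimal TSP is to take only edges in $E$ (unit weight), for a total cost of $|V|$.
taking any purple edge means
cost $\ge \underbrace{(\rho|V| + 1)}_{\text{purple cost}} + \underbrace{(|V| - 1)}_{\text{remaining originals}} = (\rho + 1)|V| > \rho|V|$
so a tour within ratio $\rho$ of optimal has cost $\le \rho|V|$ exactly when
the original graph has a Hamiltonian cycle.
$\Rightarrow$ approximation algorithm for TSP with ratio bound $\rho$ exists
only if P = NP.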
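The gap created by this cost assignment can be checked on a small instance. The sketch below is illustrative only: `gap_costs` and `optimal_tour_cost` are hypothetical helper names, $\rho = 3$ is an arbitrary choice, and the brute-force optimum is feasible only for a handful of vertices.

```python
from itertools import permutations

def gap_costs(n, edges, rho):
    """Cost 1 on original edges, rho*n + 1 on every other vertex pair."""
    edge_set = {frozenset(e) for e in edges}
    return [[1 if frozenset((u, v)) in edge_set else rho * n + 1
             for v in range(n)] for u in range(n)]

def optimal_tour_cost(cost):
    """Exact optimum by trying every tour that starts at vertex 0."""
    n = len(cost)
    return min(sum(cost[t[i]][t[(i + 1) % n]] for i in range(n))
               for t in ((0,) + p for p in permutations(range(1, n))))

n, rho = 5, 3
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # has a Hamiltonian cycle
path = cycle[:-1]                                   # a path: no such cycle
print(optimal_tour_cost(gap_costs(n, cycle, rho)))  # 5  (= |V|)
print(optimal_tour_cost(gap_costs(n, path, rho)))   # 20 (= (rho + 1)|V|)
```

With a Hamiltonian cycle the optimum is $|V| = 5$, so a $\rho$-approximation must return a tour of cost at most 15; without one, every tour costs at least 20, so the two cases can be told apart.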
15.4 Subset-sum problem
input: $\langle S, t\rangle$ : $S = \{x_1, x_2, \ldots, x_n\}$ set of integers;
$t$ = integer target sum; certificate $S'$ = subset of $S$.
Example: $S = \{1, 2, 7, 12, 18\}$, $t = 27$
$S' = \{2, 7, 18\} \subseteq S$ because $2 + 7 + 18 = 27$.
15.4.1 Exact solution:
enumerate the set of all possible subsets for the sums.
there are $2^n$ such subsets (two choices with each member:
include or exclude.)
iterative algorithm to enumerate sums, prune out sums $> t$.
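A short Python sketch of the iterative enumeration, assuming all elements of $S$ are nonnegative (otherwise pruning sums above $t$ would not be safe). The function name `exact_subset_sum` is an illustrative choice; it returns the largest achievable sum not exceeding $t$, so the decision problem is answered by comparing the result with $t$.

```python
def exact_subset_sum(S, t):
    """Largest subset sum <= t, found by enumerating achievable sums."""
    sums = {0}                       # the empty subset
    for x in S:
        # each member is either excluded (keep s) or included (s + x);
        # sums above t are pruned immediately
        sums |= {s + x for s in sums if s + x <= t}
    return max(sums)

print(exact_subset_sum([1, 2, 7, 12, 18], 27))   # 27 (= 2 + 7 + 18)
```

The set of sums can still grow exponentially in the worst case, which is why this remains an exponential-time method.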
15.4.2 NP Complete proof
1) in NP: obvious (just sum up)
2b) NP-hard: reduction from 3-CNF-SAT
express clauses as numbers
$C_1 = (x_1 \lor \lnot x_2)$, $C_2 = (\lnot x_1 \lor x_2)$
         x_1  x_2  C_1  C_2
  v_1     1    0    1    0
  v'_1    1    0    0    1
  v_2     0    1    0    1
  v'_2    0    1    1    0
  s_1     0    0    1    0
  s'_1    0    0    2    0
  s_2     0    0    0    1
  s'_2    0    0    0    2
  t       1    1    4    4
$v_i$ has 1 in the $x_i$ column, 0 in the $x_j$ columns ($i \ne j$), and
1 in column $C_k$ if the literal $x_i$ appears in clause $C_k$, 0 if not.
$v'_i$ has 1 in the $x_i$ column, 0 in the $x_j$ columns ($i \ne j$), and
1 in column $C_k$ if the literal $\lnot x_i$ appears in clause $C_k$, 0 if not.
$s_i$ has all 0 in the $x_j$ columns, 1 in column $C_i$, 0 otherwise.
$s'_i$ has all 0 in the $x_j$ columns, 2 in column $C_i$, 0 otherwise.
interpret these as decimal numbers (no carry can happen, because every column sums to less than 10)
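The construction can be sketched in Python and checked against the table. The encoding of literals as signed integers (`+i` for $x_i$, `-i` for $\lnot x_i$) and the function name `cnf_to_subset_sum` are assumptions made for this illustration; the example clauses are the ones whose rows appear in the table, $C_1 = (x_1 \lor \lnot x_2)$ and $C_2 = (\lnot x_1 \lor x_2)$.

```python
def cnf_to_subset_sum(num_vars, clauses):
    """Return ({name: number}, t); digits are ordered x_1..x_n, C_1..C_k."""
    k = len(clauses)
    width = num_vars + k

    def number(digits):              # digit list -> decimal integer
        return int("".join(map(str, digits)))

    numbers = {}
    for i in range(1, num_vars + 1):
        # v_i stands for x_i = true, v'_i for x_i = false
        for name, literal in ((f"v{i}", i), (f"v{i}'", -i)):
            digits = [0] * width
            digits[i - 1] = 1
            for j, clause in enumerate(clauses):
                if literal in clause:
                    digits[num_vars + j] = 1
            numbers[name] = number(digits)
    for j in range(k):
        # slack numbers s_j (digit 1) and s'_j (digit 2) for clause C_j
        for name, slack in ((f"s{j + 1}", 1), (f"s{j + 1}'", 2)):
            digits = [0] * width
            digits[num_vars + j] = slack
            numbers[name] = number(digits)
    t = number([1] * num_vars + [4] * k)
    return numbers, t

numbers, t = cnf_to_subset_sum(2, [[1, -2], [-1, 2]])
print(numbers["v1"], numbers["v1'"], t)   # 1010 1001 1144
```

For the satisfying assignment $x_1 = x_2 = $ true, picking $v_1$, $v_2$ and all four slack numbers gives $1010 + 101 + 10 + 20 + 1 + 2 = 1144 = t$.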
15.4.3 polynomial-time approximation
idea: compute a sum that is close to the target sum within
a factor of $1 + \epsilon$, $0 < \epsilon < 1$, rather than exactly
trim out an element $y$ if it can be approximated by a remaining element $z$:
$\frac{y}{1 + \delta} \le z \le y$
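APPROX-SUBSET-SUM($S$, $t$, $\epsilon$)
$n \leftarrow |S|$
$L_0 \leftarrow \langle 0 \rangle$
for $i \leftarrow 1$ to $n$
    do $L_i \leftarrow$ MERGE($L_{i-1}$, $L_{i-1} + x_i$)
       $L_i \leftarrow$ TRIM($L_i$, $\epsilon / (2n)$)
       $L_i \leftarrow L_i - \{$elements $> t\}$
return $z^*$, which is the largest value in $L_n$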
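Let $S = \langle 104, 102, 201, 101 \rangle$, $t = 307$
  $L_{i-1}$                              $x_i$   merged list (before trimming and pruning)
  $\langle 0 \rangle$                    104     $\langle 0, 104 \rangle$
  $\langle 0, 104 \rangle$               102     $\langle 0, 102, 104, 206 \rangle$
  $\langle 0, 102, 206 \rangle$          201     $\langle 0, 102, 201, 206, 303, 407 \rangle$
  $\langle 0, 102, 201, 303 \rangle$     101     $\langle 0, 101, 102, 201, 203, 302, 303, 404 \rangle$
Finally $L = \langle 0, 101, 201, 302 \rangle$, $z^* = 302\ (= 201 + 101)$,
which approximates the exact answer $307\ (= 101 + 102 + 104)$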
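A Python sketch of APPROX-SUBSET-SUM that reproduces this trace. The notes do not state which $\epsilon$ the example uses; $\epsilon = 0.40$ (so $\delta = \epsilon / 2n = 0.05$) is an assumption here, chosen because it yields exactly the lists shown. The helper name `trim` and the use of sorted Python lists are likewise illustrative choices.

```python
def trim(L, delta):
    """L is sorted ascending. Drop y when the last kept value z
    satisfies y / (1 + delta) <= z <= y, i.e. z approximates y."""
    kept = [L[0]]
    for y in L[1:]:
        if y > kept[-1] * (1 + delta):
            kept.append(y)
    return kept

def approx_subset_sum(S, t, eps):
    n = len(S)
    L = [0]
    for x in S:
        merged = sorted(set(L) | {y + x for y in L})   # MERGE(L, L + x)
        L = trim(merged, eps / (2 * n))
        L = [y for y in L if y <= t]                   # remove elements > t
    return max(L)

print(approx_subset_sum([104, 102, 201, 101], 307, 0.40))   # 302
```

Trimming keeps each list short (polynomial in $n$ and $1/\epsilon$), at the price of returning 302 instead of the exact 307.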
15.5 Set Covering problem
input $(X, F)$:
$X$ = set (e.g. $\{1, 2, 3, \ldots, 12\}$)
$F$ = set of subsets of $X$
(e.g., $\{S_1, S_2, S_3, S_4, S_5, S_6\}$ where
$S_1 = \{1, 2, 3, 4, 5, 6\}$,
$S_2 = \{5, 6, 8, 9\}$,
$S_3 = \{1, 4, 7, 10\}$,
$S_4 = \{2, 5, 7, 8, 11\}$,
$S_5 = \{3, 6, 9, 12\}$,
$S_6 = \{10, 11\}$).
Definition: a member $S \in F$ covers the elements of $S$.
e.g., $S_6$ covers 10, 11
Goal: find $C \subseteq F$ that covers $X$, with minimum $|C|$.
Real-Life Analogy: each set $S_i$ = person $i$ with skills $\subseteq X$
Goal is to hire the smallest group of people with all the skills
$X$.
The problem of SET-COVER is NP-complete.
Why? reduction from VERTEX-COVER problem:
$X$ = set of edges, $F$ = one subset per vertex (the edges incident to that vertex)
15.5.1 Greedy approximation for SET-COVER
intuition: Hire one person at a time, each time try to hire
one to cover the max # of uncovered skills
Example:
$S_1 = \{1, 2, 3, 4, 5, 6\}$,
$S_2 = \{5, 6, 8, 9\}$,
$S_3 = \{1, 4, 7, 10\}$,
$S_4 = \{2, 5, 7, 8, 11\}$,
$S_5 = \{3, 6, 9, 12\}$,
$S_6 = \{10, 11\}$.
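  $U$ (still uncovered)                      Pick $S$
  $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$   $S_1 = \{1, 2, 3, 4, 5, 6\}$
  $\{7, 8, 9, 10, 11, 12\}$                   $S_4 = \{2, 5, 7, 8, 11\}$
  $\{9, 10, 12\}$                             $S_5 = \{3, 6, 9, 12\}$
  $\{10\}$                                    $S_3 = \{1, 4, 7, 10\}$ (tie with $S_6 = \{10, 11\}$)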
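GREEDY-SET-COVER($X$, $F$)
$U \leftarrow X$
$C \leftarrow \emptyset$
while $U \ne \emptyset$
    do pick $S \in F$ to maximize $|S \cap U|$
       (i.e., the set covering the most still-uncovered elements)
       $U \leftarrow U - S$
       $C \leftarrow C \cup \{S\}$
return $C$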
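GREEDY-SET-COVER translates almost line for line into Python. In this sketch the family $F$ is a dict from set names to Python sets, and ties are broken in favor of the set listed first; both are choices made for illustration, since the pseudocode leaves tie-breaking open.

```python
def greedy_set_cover(X, F):
    """F maps a name to a set. Returns the chosen names in pick order."""
    uncovered = set(X)
    cover = []
    while uncovered:
        # max() returns the first maximal key, so ties go to the earlier set
        best = max(F, key=lambda name: len(F[name] & uncovered))
        if not F[best] & uncovered:
            raise ValueError("F does not cover X")
        uncovered -= F[best]
        cover.append(best)
    return cover

F = {
    "S1": {1, 2, 3, 4, 5, 6},
    "S2": {5, 6, 8, 9},
    "S3": {1, 4, 7, 10},
    "S4": {2, 5, 7, 8, 11},
    "S5": {3, 6, 9, 12},
    "S6": {10, 11},
}
print(greedy_set_cover(range(1, 13), F))   # ['S1', 'S4', 'S5', 'S3']
```

Each iteration scans every set once and at least one element is covered per iteration, so the loop runs at most $\min(|X|, |F|)$ times.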
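Polynomial time!
GREEDY-SET-COVER has a ratio bound $\rho(n) = H(\max\{|S| : S \in F\})$
i.e. a harmonic number $H(d) = \sum_{i=1}^{d} 1/i$, hence $\rho(n) \le \ln|X| + 1$
(proof on page 1036... don't worry about it)
for small sets, a better bound than APPROX-VERTEX-COVER
(e.g., vertex cover on graphs of degree at most 3: $H(3) = 11/6 < 2$)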
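The ratio bound can be checked numerically on the example instance. This is a sketch under stated assumptions: `min_cover_size` is a hypothetical brute-force helper (exponential in $|F|$, fine for six sets), and the greedy cover size of 4 is taken from the example trace rather than recomputed.

```python
from fractions import Fraction
from itertools import combinations
from math import log

def harmonic(d):
    """H(d) = 1 + 1/2 + ... + 1/d, as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, d + 1))

def min_cover_size(X, sets):
    """Smallest number of sets covering X, by trying families of growing size."""
    X = set(X)
    for size in range(1, len(sets) + 1):
        for family in combinations(sets, size):
            if set().union(*family) >= X:
                return size
    return None                      # the sets do not cover X

sets = [{1, 2, 3, 4, 5, 6}, {5, 6, 8, 9}, {1, 4, 7, 10},
        {2, 5, 7, 8, 11}, {3, 6, 9, 12}, {10, 11}]
optimal = min_cover_size(range(1, 13), sets)    # 3, e.g. S3, S4, S5
bound = harmonic(max(len(S) for S in sets))     # H(6) = 49/20
greedy_size = 4                                 # from the trace: S1, S4, S5, S3
print(optimal, bound, Fraction(greedy_size, optimal) <= bound)
print(float(bound) <= log(12) + 1)
```

Here greedy uses 4 sets against an optimum of 3, a ratio of $4/3$, comfortably inside $H(6) = 2.45$, which is in turn below $\ln 12 + 1 \approx 3.48$.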