For many problems some extra space really pays off
(extra space in tables - breathing room)
 input enhancement
• non comparison-based sorting
• auxiliary tables (shift tables for pattern matching)

prestructuring
• hashing
• indexing schemes (eg, B-trees)

tables of information that do all the work
• dynamic programming
Design and Analysis of Algorithms - Chapter 7

1

Sorting by Counting
Algorithm ComparisonCountingSort(A[0..n-1])
for i  0 to n-1 do Count[i] 0
for i  0 to n-2 do
for j  i+1 to n-1 do
if A[i] <A[j] then Count[j]  Count[j]+1
else Count[i]  Count[i]+1
for i  0 to n-1 do S[Count[i]]  A[i]
 Example: 62 31 84 96 19 47
 Efficiency
Design and Analysis of Algorithms - Chapter 7

2

Sorting by Counting (2) Algorithm DistributionCountingSort(A[0.n-1]) for j  0 to u-l do D[j] 0 for i  0 to n-1 do D[A[i]-l]  D[A[i]-l] + 1 for j  1 to u-l do D[j]  D[j-1]+D[j] for i  n-1 down to 0 do j  A[i]-l.Chapter 7 3 . D[j]  D[j]-1  Example: 13 11 12 13 12 12  Efficiency Design and Analysis of Algorithms .. S[D[j]-1]  A[i].

Chapter 7 4 . Align pattern at beginning of text 2. realign pattern one position to the right and repeat step 2.String matching  pattern: a string of m characters to search for text: a (long) string of n characters to search in  Brute force algorithm:  1. compare each character of pattern to the corresponding character in text until – all characters are found to match (successful search). Design and Analysis of Algorithms . moving from left to right. while pattern is not found and the text is not yet exhausted. or – a mismatch is detected 3.

Morris independently discovers same algorithm in attempt to avoid “backing up” over text At about the same time Boyer and Moore find an algorithm that examines only a fraction of the text in most cases (by comparing characters in pattern and text from right to left.History     1970: Cook shows (using finite-state machines) that problem can be solved in time proportional to n+m 1976 Knuth and Pratt find algorithm based on Cook’s idea. instead of left to right) 1980 Another algorithm proposed by Rabin and Karp virtually always runs in time proportional to n+m and has the advantage of extending easily to two-dimensional pattern matching and being almost as simple as the bruteforce method.String searching .Chapter 7 5 . Design and Analysis of Algorithms .

create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement) Design and Analysis of Algorithms .Chapter 7 6 .Horspool’s Algorithm  A simplified version of Boyer-Moore algorithm that retains key insights: • compare pattern characters to text from right to left • given a pattern.

.................c.................................... Three cases: • The character is not in the pattern .... (O occurs once in pattern) BAOBAB . (A occurs twice in pattern) BAOBAB • The rightmost characters produced a match . BAOBAB  Shift Table: Stores number of characters to shift by depending on first character compared Design and Analysis of Algorithms .A...B...Chapter 7 7 .................O... (c not in pattern) BAOBAB • The character is in the pattern (but not at rightmost position) ....................How far to shift? Look at first (rightmost) character in text that was compared.....

Chapter 7 8 .Shift table  Constructed by scanning pattern before search begins  All entries are initialized to length of pattern.m-1]) for i 0 to size-1 do Table[i] m for j 0 to m-2 do Table[P[j]] m-1-j return Table Design and Analysis of Algorithms . update table entry to distance of rightmost occurrence of c from end of pattern   Algorithm ShiftTable(P[0. For c occurring in pattern..

Shift table  Example for pattern BAOBAB: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6  Then: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Design and Analysis of Algorithms .Chapter 7 9 .

.m-1..n-1]]) ShiftTable(P[0.T[0.m-1]) i  m-1 while i<=n-1 do k 0 while k<=m-1 and P[m-1-k]=T[i-k] do k  k+1 if k=m return i-m+1 else i  i+Table[T[i]] return -1 Design and Analysis of Algorithms .Chapter 7 10 .The Algorithm  Horspool Matching(P[0..

create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement) Uses additional shift table with same idea applied to the number of matched characters Design and Analysis of Algorithms .Chapter 7 11 .Boyer-Moore algorithm   Based on same two ideas: • compare pattern characters to text from right to left • given a pattern.

then shift in the same way by m characters (actually c is a bad symbol) • If the mismatching character (the bad symbol) does not appear in the pattern. then shift to overpass it • If the mismathcing character (the bad symbol) appears in the pattern. Design and Analysis of Algorithms . then shift to align the bad symbol to the same text character (lying to the left of the mismatching position).Chapter 7 12 . the text character corresponding to the last pattern character. this table is computed differently • If c. is not in the pattern.The bad-symbol shift   Based on the Horspool idea of using the extra table However.

.. This shift is given by: d=max[t1(c)-k...AER........... k the distance between the bad symbol from the end of the pattern Design and Analysis of Algorithms .1].. BARBER BARBER shift 2 positions........Chapter 7 13 ....example The bad symbol IS NOT in the pattern .........SER. BARBER BARBER shift 4 positions The bad symbol IS in the pattern . where t1 is the Horspool table..The bad-symbol shift .............

Chapter 7 14 . Calculate the shift as the distance between the suffix and the prefix.  Also.The good-suffix shift . Calculate the shift as the distance between two occurrences of the suffix. ABRACADABRA)  Important to find another suffix with a different previous character. important to find the longest prefix of size l<k that matches the suffix of size l.  Design and Analysis of Algorithms .example What happens if a matched suffix appears again in the pattern (eg.

Chapter 7 15 .The good-suffix shift – example (2) K pattern d2 K pattern d2 1 ABCBAB 2 1 BAOBAB 2 2 ABCBAB 4 2 BAOBAB 5 3 ABCBAB 4 3 BAOBAB 5 4 ABCBAB 4 4 BAOBAB 5 5 ABCBAB 4 5 BAOBAB 5 Design and Analysis of Algorithms .

d2) if k>0 where d1=max(t1(c)-k) {  Example BESS_KNEW_ABOUT_BAOBABS BAOBAB Design and Analysis of Algorithms .Chapter 7 16 .Final rule for Boyer-Moore  Calculate shift as d1 if k=0 d= max(d1.

i..e.Hashing  A very efficient method for implementing a dictionary.Chapter 7 17 . a set with the operations: – insert – find – delete  Applications: – databases – symbol tables Design and Analysis of Algorithms .

where is record with SSN= 315-17-4251 stored?  Hash function must:   • be easy to compute • distribute keys evenly throughout the table Design and Analysis of Algorithms . Hash function: h(k) = k mod m (k is a key and m is the number of buckets) • if m=1000.Hash tables and hash functions  Hash table: an array with indices that correspond to buckets Hash function: determines the bucket for each record Example: student records.Chapter 7 18 . key=SSN.

in case of collision.Collisions     If h(k1) = h(k2) then there is a collision.bucket points to linked list of all keys hashing to it. • Closed hashing – one key per bucket.Chapter 7 19 . Two types handle collisions differently: • Open hashing . find another bucket for one of the keys – linear probing: use next bucket – double hashing: use second hash function to compute increment Design and Analysis of Algorithms . Good hash functions result in fewer collisions. Collisions can never be completely eliminated.

Open hashing If hash function distributes keys uniformly. U = α  Worst-case is still linear!  Open hashing still works if n>m.Chapter 7 20 . average length of linked list will be n/m  Average number of probes S = 1+α/2.  Design and Analysis of Algorithms .

Avoids pointers.α)²)  As the table gets filled (α approaches 1).Closed hashing     Does not work if n>m.α)) – unsuccessful search: (½) (1+ 1/(1.Chapter 7 21 . Number of probes to insert/find/delete a key depends on load factor α = n/m (hash table density) – successful search: (½) (1+ 1/(1. number of probes increases dramatically: Design and Analysis of Algorithms . Deletions are not straightforward.