Professional Documents
Culture Documents
Algorithms
F O U R T H E D I T I O N
! ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
Symbol tables
Key-value pair abstraction.
Insert a value with specified key. Given a key, search for the corresponding value.
Ex. DNS lookup.
Insert domain name with specified IP address. Given domain name, find corresponding IP address.
domain name www.cs.princeton.edu www.princeton.edu www.yale.edu www.harvard.edu www.simpsons.com IP address 128.112.136.11 128.112.128.15 130.132.143.21 128.103.060.55 209.052.165.60
key
value
3
application dictionary book index file share financial account web search compiler routing table DNS reverse DNS genomics file system
purpose of search find definition find relevant pages find song to download process transactions find relevant web pages find properties of variables route Internet packets find IP address find domain name find markers find file on disk
key word term name of song account number keyword variable name destination domain name IP address DNA string filename
value definition list of page numbers computer ID transaction details list of page names type and value best route IP address domain name known positions location on disk
3.1
Symbol Tables
363
The symbol table is a prototypical abstract data type (see Chapter 1): it represents a well-dened set of values and operations on those values, enabling us to develop Associative array abstraction. Associate one valuedene with the each key. clients and implementations separately. As usual, we precisely operations by specifying an applications programming interface (API) that provides the contract between client and implementation:
public class ST<Key, Value> ST() void put(Key key, Value val) Value get(Key key) void delete(Key key) boolean contains(Key key) boolean isEmpty() int size() Iterable<Key> keys()
create a symbol table put key-value pair into the table (remove key from table if value is null) value paired with key (null if key is absent) remove key (and its value) from table is there a value paired with key? is the table empty? number of key-value pairs in the table all the keys in the table
API for a generic basic symbol table
a[key] = val; a[key]
Before examining client code, we consider several design choices for our implementations to make our code consistent, compact, and useful.
Generics. As we did with sorting, we will consider the methods without specifying the types of the items being processed, using generics. For symbol tables, we emphasize the separate roles played by keys and values in search by specifying the key and value types
Conventions
Values are not null. Method get() returns null if key not present. Method put() overwrites old value with new value.
Intended consequences.
Assume keys are Comparable, use compareTo(). Assume keys are any generic type, use equals() to test equality. Assume keys are any generic type, use equals() to test equality;
use hashCode() to scramble key.
built-in to Java (stay tuned)
Immutable in Java: Integer, Double, String, java.io.File, Mutable in Java: StringBuilder, java.net.URL, arrays, ...
7
Equality test
All Java classes inherit a method equals().
x.equals(x) x.equals(y)
equivalence relation
Default implementation. (x == y) Customized implementations. Integer, Double, String, java.io.File, User-defined implementations. Some care needed.
if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; } }
public final class Date implements Comparable<Date> { private final int month; private final int day; private final int year; ... public boolean equals(Object y) { if (y == this) return true; if (y == null) return false; if (y.getClass() != this.getClass()) return false; Date that = (Date) y; if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; } }
optimize for true object equality check for null objects must be in the same class (religion: getClass() vs. instanceof)
cast is guaranteed to succeed check that all significant fields are the same
10
Equals design
"Standard" recipe for user-defined types.
Optimization for reference equality. Check against null. Check that two objects are of the same type and cast. Compare each significant field:
if field is a primitive type, use == if field is an object, use equals() if field is an array, apply to each entry
apply rule recursively alternatively, use Arrays.equals(a, b) or Arrays.deepEquals(a, b), but not a.equals(b)
Best practices.
No need to use calculated fields that depend on other fields. Compare fields mostly likely to differ first. Make compareTo() consistent with equals().
x.equals(y) if and only if (x.compareTo(y) == 0)
11
public static void main(String[] args) { ST<String, Integer> st = new ST<String, Integer>(); for (int i = 0; !StdIn.isEmpty(); i++) keys S E A R C H { values 0 1 2 3 4 5 String key = StdIn.readString(); st.put(key, i); output for basic symbol table } (one possibility) for (String s : st.keys()) L 11 StdOut.println(s + " " + st.get(s)); } P 10
E 6
X 7
A 8
9 10 11 12
A C E H L M P R S X
8 4 12 5 11 9 10 3 0 7
12
keys
E 1
A 2
R 3
C 4
H 5
E 6
X 7
A 8
M 9
values 0
M X H C P L ER 10 11 12A E S
9 7 5 4 3 8 12 0
11
% java FrequencyCounter 1 < tinyTale.txt it 10 % java FrequencyCounter 8 < tale.txt business 122 % java FrequencyCounter 10 < leipzig1M.txt government 24763
public class FrequencyCounter { public static void main(String[] args) { int minlen = Integer.parseInt(args[0]); ST<String, Integer> st = new ST<String, Integer>(); while (!StdIn.isEmpty()) { String word = StdIn.readString(); ignore short strings if (word.length() < minlen) continue; if (!st.contains(word)) st.put(word, 1); else st.put(word, st.get(word) + 1); } String max = ""; st.put(max, 0); for (String word : st.keys()) if (st.get(word) > st.get(max)) max = word; StdOut.println(max + " " + st.get(max)); } }
create ST
14
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
S E A R C H E X A M
0 1 2 3 4 5 6 7 8 9
S 0 E 1 A 2 R 3 C 4 H 5 H 5 X 7 X 7 M 9 P 10 L 11 L 11
red nodes are new S 0 E 1 A 2 R 3 C 4 C 4 H 5 H 5 X 7 M 9 P 10 P 10 S 0 E 1 A 2 R 3 R 3 C 4 C 4 H 5 X 7 M 9 M 9 S 0 E 1 A 2 A 2 R 3 R 3 C 4 H 5 X 7 X 7 S 0 E 1 E 6 A 2 A 8 R 3 C 4 H 5 H 5 S 0 S 0 E 6 E 6 A 8 R 3 C 4 C 4 S 0 S 0 E 6 A 8 R 3 R 3 gray nodes are untouched S 0 E 6 A 8 A 8 S 0 E 6 E 12 S 0 S 0 circled entries are changed values black nodes are accessed in search
P 10 L 11 E 12
ST implementation
ordered iteration?
key interface
N/2
no
equals()
18
0 A A A A
1 C C C C
2 E E E E
keys[] 3 4 5 6 H H H H L L L L M M M M P P P P
7 R R R R
8 S S S S
lo hi m 0 9 4 5 5 6 9 6 6 7 5 6
lo hi m 0 9 4 5 9 7 5 7 6 6 5 6
loop exits with lo > hi: return 7 Trace of binary search for rank in an ordered array
19
}
20
key value
0 S E A A A A A A A A A A A A
1 S E E C C C C C C C C C C
keys[] 3 4 5 6
N 1
0 0 1 2 2 2 2 2 2 8 8 8 8 8 8
1 0 1 1 4 4 4 4 4 4
vals[] 3 4 5 6
S E A R C H E X A M P L E
0 1 2 3 4 5 6 7 8 9 10 11 12
S R E E E E E E E E E E
X X S R P P P
0 3 1 1 6 6 6 6
7 0 3 3 3 7 0 0 0 7 7 7
4 6 4 6 4 12 4 12
5 9 10 3 5 11 9 10 5 11 9 10 5 11 9 10
21
ST implementation
ordered iteration?
key interface
N/2
no
equals()
log N
log N
N/2
yes
compareTo()
22
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
values Chicago Phoenix Houston Chicago Houston Chicago Seattle Seattle Phoenix Chicago Chicago Chicago Seattle Seattle Chicago Chicago Seattle Phoenix
keys(09:15:00, 09:25:00)
09:00:00 09:00:03 09:00:13 09:00:59 09:01:10 09:03:13 09:10:11 09:10:25 09:14:25 09:19:32 09:19:46 09:21:05 09:22:43 09:22:54 09:25:52 09:35:21 09:36:14 09:37:44
25
pose that your keys are times of the day. You might be interested in knowing the earliest or the latest time, the set of keys that fall between two given times, and so forth. In most cases, such operations are not difcult to implement with the same data structures and Ordered symbol table API and get() implementations. Specically, for applicamethods underlying the put() tions where keys are Comparable, we implement in this chapter the following API:
public class ST<Key extends Comparable<Key>, Value> ST() void put(Key key, Value val) Value get(Key key) void delete(Key key) boolean contains(Key key) boolean isEmpty() int size() Key min() Key max() Key floor(Key key) Key ceiling(Key key) int rank(Key key) Key select(int k) void deleteMin() void deleteMax() int size(Key lo, Key hi) Iterable<Key> keys(Key lo, Key hi) Iterable<Key> keys()
create an ordered symbol table put key-value pair into the table (remove key from table if value is null) value paired with key (null if key is absent) remove key (and its value) from table is there a value paired with key? is the table empty? number of key-value pairs smallest key largest key largest key less than or equal to key smallest key greater than or equal to key number of keys less than key key of rank k delete smallest key delete largest key number of keys in [lo..hi] keys in [lo..hi], in sorted order all keys in the table, in sorted order
API for a generic ordered symbol table
26
sequential search search insert / delete min / max floor / ceiling rank select ordered iteration N N N N N N N lg N
binary search lg N N 1 lg N lg N 1 N
order of growth of the running time for ordered symbol table operations
27
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
! ordered operations
Algorithms
Algorithms
F O U R T H E D I T I O N
! ordered operations