You are on page 1of 29

Algorithms

R OBERT S EDGEWICK | K EVIN W AYNE

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
F O U R T H E D I T I O N

! ordered operations

R OBERT S EDGEWICK | K EVIN W AYNE


http://algs4.cs.princeton.edu

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

Symbol tables
Key-value pair abstraction.

Insert a value with specified key. Given a key, search for the corresponding value.
Ex. DNS lookup.

Insert domain name with specified IP address. Given domain name, find corresponding IP address.
domain name www.cs.princeton.edu www.princeton.edu www.yale.edu www.harvard.edu www.simpsons.com IP address 128.112.136.11 128.112.128.15 130.132.143.21 128.103.060.55 209.052.165.60

key

value
3

Symbol table applications

application dictionary book index file share financial account web search compiler routing table DNS reverse DNS genomics file system

purpose of search find definition find relevant pages find song to download process transactions find relevant web pages find properties of variables route Internet packets find IP address find domain name find markers find file on disk

key word term name of song account number keyword variable name destination domain name IP address DNA string filename

value definition list of page numbers computer ID transaction details list of page names type and value best route IP address domain name known positions location on disk

3.1

Symbol Tables

363

Basic symbol table API


API

The symbol table is a prototypical abstract data type (see Chapter 1): it represents a well-dened set of values and operations on those values, enabling us to develop Associative array abstraction. Associate one valuedene with the each key. clients and implementations separately. As usual, we precisely operations by specifying an applications programming interface (API) that provides the contract between client and implementation:
public class ST<Key, Value> ST() void put(Key key, Value val) Value get(Key key) void delete(Key key) boolean contains(Key key) boolean isEmpty() int size() Iterable<Key> keys()

create a symbol table put key-value pair into the table (remove key from table if value is null) value paired with key (null if key is absent) remove key (and its value) from table is there a value paired with key? is the table empty? number of key-value pairs in the table all the keys in the table
API for a generic basic symbol table
a[key] = val; a[key]

Before examining client code, we consider several design choices for our implementations to make our code consistent, compact, and useful.

Generics. As we did with sorting, we will consider the methods without specifying the types of the items being processed, using generics. For symbol tables, we emphasize the separate roles played by keys and values in search by specifying the key and value types

Conventions

Values are not null. Method get() returns null if key not present. Method put() overwrites old value with new value.
Intended consequences.

Easy to implement contains().


public boolean contains(Key key) { return get(key) != null; }

Can implement lazy version of delete().


public void delete(Key key) { put(key, null); }

Keys and values


Value type. Any generic type.

specify Comparable in API.

Key type: several natural assumptions.

Assume keys are Comparable, use compareTo(). Assume keys are any generic type, use equals() to test equality. Assume keys are any generic type, use equals() to test equality;
use hashCode() to scramble key.
built-in to Java (stay tuned)

Best practices. Use immutable types for symbol table keys.

Immutable in Java: Integer, Double, String, java.io.File, Mutable in Java: StringBuilder, java.net.URL, arrays, ...
7

Equality test
All Java classes inherit a method equals().

Java requirements. For any references x, y and z:

Reflexive: Symmetric: Transitive: Non-null:

x.equals(x) x.equals(y)

is true. iff y.equals(x). is false.

equivalence relation

if x.equals(y) and y.equals(z), then x.equals(z).


x.equals(null)

do x and y refer to the same object?

Default implementation. (x == y) Customized implementations. Integer, Double, String, java.io.File, User-defined implementations. Some care needed.

Implementing equals for user-dened types


Seems easy.
public class Date implements Comparable<Date> { private final int month; private final int day; private final int year; ... public boolean equals(Date that) {

if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; } }

check that all significant fields are the same

Implementing equals for user-dened types


Seems easy, but requires some care.
typically unsafe to use equals() with inheritance (would violate symmetry)

public final class Date implements Comparable<Date> { private final int month; private final int day; private final int year; ... public boolean equals(Object y) { if (y == this) return true; if (y == null) return false; if (y.getClass() != this.getClass()) return false; Date that = (Date) y; if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; } }

must be Object. Why? Experts still debate.

optimize for true object equality check for null objects must be in the same class (religion: getClass() vs. instanceof)

cast is guaranteed to succeed check that all significant fields are the same

10

Equals design
"Standard" recipe for user-defined types.

Optimization for reference equality. Check against null. Check that two objects are of the same type and cast. Compare each significant field:
if field is a primitive type, use == if field is an object, use equals() if field is an array, apply to each entry
apply rule recursively alternatively, use Arrays.equals(a, b) or Arrays.deepEquals(a, b), but not a.equals(b)

Best practices.

No need to use calculated fields that depend on other fields. Compare fields mostly likely to differ first. Make compareTo() consistent with equals().
x.equals(y) if and only if (x.compareTo(y) == 0)
11

ST test client for traces


Build ST by associating value i with ith string from standard input.

public static void main(String[] args) { ST<String, Integer> st = new ST<String, Integer>(); for (int i = 0; !StdIn.isEmpty(); i++) keys S E A R C H { values 0 1 2 3 4 5 String key = StdIn.readString(); st.put(key, i); output for basic symbol table } (one possibility) for (String s : st.keys()) L 11 StdOut.println(s + " " + st.get(s)); } P 10

E 6

X 7

A 8

9 10 11 12

output for output ordered symbol table

A C E H L M P R S X

8 4 12 5 11 9 10 3 0 7
12

keys

E 1

A 2

R 3

C 4

H 5

E 6

X 7

A 8

M 9

values 0

output for basic symbol table (one possibility)

output for ordered symbol table

M X H C P L ER 10 11 12A E S

9 7 5 4 3 8 12 0

11

Keys, values, and output for test client

ST test client for analysis


Frequency counter. Read a sequence of strings from standard input and print out one that occurs with highest frequency.
% more it was it was it was it was it was it was it was it was it was it was tinyTale.txt the best of times the worst of times the age of wisdom the age of foolishness the epoch of belief the epoch of incredulity the season of light the season of darkness the spring of hope the winter of despair
tiny example (60 words, 20 distinct) real example (135,635 words, 10,769 distinct) real example (21,191,455 words, 534,580 distinct)
13

% java FrequencyCounter 1 < tinyTale.txt it 10 % java FrequencyCounter 8 < tale.txt business 122 % java FrequencyCounter 10 < leipzig1M.txt government 24763

Frequency counter implementation

public class FrequencyCounter { public static void main(String[] args) { int minlen = Integer.parseInt(args[0]); ST<String, Integer> st = new ST<String, Integer>(); while (!StdIn.isEmpty()) { String word = StdIn.readString(); ignore short strings if (word.length() < minlen) continue; if (!st.contains(word)) st.put(word, 1); else st.put(word, st.get(word) + 1); } String max = ""; st.put(max, 0); for (String word : st.keys()) if (st.get(word) > st.get(max)) max = word; StdOut.println(max + " " + st.get(max)); } }

create ST

read string and update frequency

print a string with max freq

14

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

Sequential search in a linked list


Data structure. Maintain an (unordered) linked list of key-value pairs. Search. Scan through all keys until find a match. Insert. Scan through all keys until find a match; if no match add to front.
key value first

S E A R C H E X A M

0 1 2 3 4 5 6 7 8 9

S 0 E 1 A 2 R 3 C 4 H 5 H 5 X 7 X 7 M 9 P 10 L 11 L 11

red nodes are new S 0 E 1 A 2 R 3 C 4 C 4 H 5 H 5 X 7 M 9 P 10 P 10 S 0 E 1 A 2 R 3 R 3 C 4 C 4 H 5 X 7 M 9 M 9 S 0 E 1 A 2 A 2 R 3 R 3 C 4 H 5 X 7 X 7 S 0 E 1 E 6 A 2 A 8 R 3 C 4 H 5 H 5 S 0 S 0 E 6 E 6 A 8 R 3 C 4 C 4 S 0 S 0 E 6 A 8 R 3 R 3 gray nodes are untouched S 0 E 6 A 8 A 8 S 0 E 6 E 12 S 0 S 0 circled entries are changed values black nodes are accessed in search

P 10 L 11 E 12

Trace of linked-list ST implementation for standard indexing client


17

Elementary ST implementations: summary

ST implementation

worst-case cost (after N inserts) search insert

average case (after N random inserts) search hit insert

ordered iteration?

key interface

sequential search (unordered list)

N/2

no

equals()

Challenge. Efficient implementations of both search and insert.

18

Binary search in an ordered array


Data structure. Maintain an ordered array of key-value pairs. Rank helper function. How many keys < k ?

successful search for P

0 A A A A

1 C C C C

2 E E E E

keys[] 3 4 5 6 H H H H L L L L M M M M P P P P

7 R R R R

8 S S S S

9 X X X X entries in black are a[lo..hi] entry in red is a[m]

lo hi m 0 9 4 5 5 6 9 6 6 7 5 6

unsuccessful search for Q

loop exits with keys[m] = P: return 6 A A A A C C C C E E E E H H H H L L L L M M M M P P P P R R R R S S S S X X X X

lo hi m 0 9 4 5 9 7 5 7 6 6 5 6

loop exits with lo > hi: return 7 Trace of binary search for rank in an ordered array
19

Binary search: Java implementation


public Value get(Key key) { if (isEmpty()) return null; int i = rank(key); if (i < N && keys[i].compareTo(key) == 0) return vals[i]; else return null; }
number of keys < key private int rank(Key key) { int lo = 0, hi = N-1; while (lo <= hi) { int mid = lo + (hi - lo) / 2; int cmp = key.compareTo(keys[mid]); if (cmp < 0) hi = mid - 1; else if (cmp > 0) lo = mid + 1; else if (cmp == 0) return mid; } return lo;

}
20

Binary search: trace of standard indexing client


Problem. To insert, need to shift all greater keys over.

key value

0 S E A A A A A A A A A A A A

1 S E E C C C C C C C C C C

keys[] 3 4 5 6

N 1

0 0 1 2 2 2 2 2 2 8 8 8 8 8 8

1 0 1 1 4 4 4 4 4 4

vals[] 3 4 5 6

S E A R C H E X A M P L E

0 1 2 3 4 5 6 7 8 9 10 11 12

S R E E E E E E E E E E

entries in red were inserted S R H H H H H H H H H S R R R R M M L L L S S S S R P M M M

X X S R P P P

2 3 4 entries in gray 5 did not move 6 6 7 7 X 8 S R R X S S S X X X 9 10 10

0 3 1 1 6 6 6 6

entries in black moved to the right 0 3 5 5 5 5 5

0 3 3 3 3 9 0 0 0 0 3 circled entries are changed values 7 7 0

7 0 3 3 3 7 0 0 0 7 7 7

4 6 4 6 4 12 4 12

5 9 10 3 5 11 9 10 5 11 9 10 5 11 9 10

Trace of ordered-array ST implementation for standard indexing client

21

Elementary ST implementations: summary

ST implementation

worst-case cost (after N inserts) search insert

average case (after N random inserts) search hit insert

ordered iteration?

key interface

sequential search (unordered list)

N/2

no

equals()

binary search (ordered array)

log N

log N

N/2

yes

compareTo()

Challenge. Efficient implementations of both search and insert.

22

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

Examples of ordered symbol table API


keys
min()

values Chicago Phoenix Houston Chicago Houston Chicago Seattle Seattle Phoenix Chicago Chicago Chicago Seattle Seattle Chicago Chicago Seattle Phoenix

get(09:00:13) floor(09:05:00) select(7)

keys(09:15:00, 09:25:00)

ceiling(09:30:00) max() size(09:15:00, 09:25:00) is 5 rank(09:10:25) is 7

09:00:00 09:00:03 09:00:13 09:00:59 09:01:10 09:03:13 09:10:11 09:10:25 09:14:25 09:19:32 09:19:46 09:21:05 09:22:43 09:22:54 09:25:52 09:35:21 09:36:14 09:37:44

Examples of ordered symbol-table operations

25

pose that your keys are times of the day. You might be interested in knowing the earliest or the latest time, the set of keys that fall between two given times, and so forth. In most cases, such operations are not difcult to implement with the same data structures and Ordered symbol table API and get() implementations. Specically, for applicamethods underlying the put() tions where keys are Comparable, we implement in this chapter the following API:
public class ST<Key extends Comparable<Key>, Value> ST() void put(Key key, Value val) Value get(Key key) void delete(Key key) boolean contains(Key key) boolean isEmpty() int size() Key min() Key max() Key floor(Key key) Key ceiling(Key key) int rank(Key key) Key select(int k) void deleteMin() void deleteMax() int size(Key lo, Key hi) Iterable<Key> keys(Key lo, Key hi) Iterable<Key> keys()

create an ordered symbol table put key-value pair into the table (remove key from table if value is null) value paired with key (null if key is absent) remove key (and its value) from table is there a value paired with key? is the table empty? number of key-value pairs smallest key largest key largest key less than or equal to key smallest key greater than or equal to key number of keys less than key key of rank k delete smallest key delete largest key number of keys in [lo..hi] keys in [lo..hi], in sorted order all keys in the table, in sorted order
API for a generic ordered symbol table
26

Binary search: ordered symbol table operations summary

sequential search search insert / delete min / max floor / ceiling rank select ordered iteration N N N N N N N lg N

binary search lg N N 1 lg N lg N 1 N

order of growth of the running time for ordered symbol table operations

27

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

! ordered operations

Algorithms

R OBERT S EDGEWICK | K EVIN W AYNE

3.1 S YMBOL T ABLES


! API ! elementary implementations

Algorithms
F O U R T H E D I T I O N

! ordered operations

R OBERT S EDGEWICK | K EVIN W AYNE


http://algs4.cs.princeton.edu

You might also like