- Pre ILP Intial Exercise Set
- New Microsoft Word Document
- CollectionsFramework.pdf
- 1307.2982
- Immadi Pulikeshi
- Collections Framework
- SAP ABAP Data Dictionary Questions With Answers Part 2
- treeiimalecture
- Innodb Mysql
- THESIS Gerrit
- dbms lesson3.2
- HW1 RSP
- FRAME 2011 Users Manual
- Collection 2
- DATA MINING.epub
- Data Structure
- Artificial Intelligence Search Algorithms
- Session 7-Technique for Order Performance by Similarity to Ideal
- Iterative Methods
- Practice Midterm, ENGR 62, Stanford
- 2251-712X-8-26.pdf
- ps4_6245_2004.pdf
- Querry Processing
- ch6
- 106 2 Qualitative Reasoning
- _dea138dd95831712afcfb9d36eff6986_GraphBasedMethods
- Intro to Caching, Caching Algorithms and Caching Frameworks, Part 3
- Lecture 4 Scan Conversion Bresenhams Algorithm
- Statement of the problem
- IE 515 Case Study 2010 Fall
- Taller 2
- Viñetas
- Lenguajes de Programacion 1
- Taller Clase 1 (1)
- vacuna
- Notas
- Practica Validacion (1)
- Plantilla Diseño de Experimentos
- TABELA ncm medicamentos.xls
- Tablas.docx
- Ejercicio Practico 01 - Word
- Taller 3
- Tesis Tinta Continua
- practica1 (3)
- Evid
- code1
- Code
- code3
- Met Edatur
- MATLAB
- Laboratorio Ms Project
- Sistema Relevante
- Hotspot Mikrotik
- Prob
- Porque Normalizar
- matricula
- Entrevista a Paul Julian Smith
- bol8
- Conclu
- PARETO Matricula

Hashing II

6.006 Fall 2011

Lecture 9: Hashing II

Lecture Overview

• Table Resizing

• Amortization

• String Matching and Karp-Rabin

• Rolling Hash

Recall:

Hashing with Chaining:

all possible

keys

U

collisions

table

.k

.

.

k

k k.

.

k1

2

4

3

.

.

k1

k4

k2

}

n keys

in set DS

h

.

m slots

k3

expected size

α = n/m

**Figure 1: Hashing with Chaining
**

Expected cost (insert/delete/search): Θ(1 + α), assuming simple uniform hashing OR

universal hashing & hash function h takes O(1) time.

Division Method:

h(k) = k

mod m

where m is ideally prime

Multiplication Method:

h(k) = [(a · k) mod 2w ] (w − r)

where a is a random odd integer between 2w−1 and 2w , k is given by w bits, and m = table

size = 2r .

1

but averaged over all ops. r) =⇒ must rebuild hash table from scratch for item in old table: → for each slot. Rehashing: To grow or shrink table hash function must change (m. Amortized Analysis This is a common technique in data structures — like paying rent: $1500/month ≈ $50/day • operation has amortized cost T (n) if k operations cost ≤ k · T (n) • “T (n) amortized” roughly means T (n) “on average”. inserting into a hash table takes O(1) amortized time.Lecture 9 Hashing II 6.006 Fall 2011 How Large should Table be? • want m = Θ(n) at all times • don’t know how large n will get at creation • m too small =⇒ slow. but Θ(1) “on average”.g. say • m + =1? =⇒ rebuild every step =⇒ n inserts cost Θ(1 + 2 + · · · + n) = Θ(n2 ) • m ∗ =2? m = Θ(n) still (r+ =1) =⇒ rebuild at insertion 2i =⇒ n inserts cost Θ(1 + 2 + 4 + 8 + · · · + n) where n is really the next power of 2 = Θ(n) • a few inserts cost linear time. for item in slot insert into new table =⇒ Θ(n + m) time = Θ(n) if m = Θ(n) How fast to grow? When n reaches m. • e. 2 . m too big =⇒ wasteful Idea: Start small (constant) and grow (or shrink) as necessary.

g.pop in O(1) amortized 0 1 2 3 4 5 6 7 } } list unused Figure 2: Resizeable Arrays String Matching Given two strings s and t. does s occur as a substring of t? (and if so. Resizable Arrays: • same trick solves Python “list” (array) • =⇒ list.append and list. where and how many times?) E.006’ and t = your entire INBOX (‘grep’ on UNIX) Simple Algorithm: any(s == t[i : i + len(s)] for i in range(len(t) − len(s))) — O(|s|) time for each substring comparison =⇒ O(|s| · (|t| − |s|)) time = O(|s| · |t|) potentially quadratic 3 . • space can get big with respect to n e. shrink to half the size =⇒ O(1) amortized cost for both insert and delete — analysis is harder. s = ‘6. n× delete • solution: when n decreases to m/4.006 Fall 2011 Back to Hashing: Maintain m = Θ(n) =⇒ α = Θ(1) =⇒ support search in O(1) expected time (assuming simple uniform or universal hashing) Delete: Also O(1) expected as is.Lecture 9 Hashing II 6. see CLRS 17.4.g. n× insert.

Rolling Hash ADT Maintain string x subject to • r(): reasonable hash function h(x) on string x • r.skip(c): remove front letter from string x.006 Fall 2011 t s s Figure 3: Illustration of Simple Algorithm for the String Matching Problem Karp-Rabin Algorithm: • Compare h(s) == h(t[i : i + len(s)]) • If hash values match. likely so do strings – can check s == t[i : i + len(s)] to be sure ∼ cost O(|s|) – if yes.append(c) for c in t[:len(s)]: rt.Lecture 9 Hashing II 6...append(c): add letter c to end of string x • r. • expected time = O(|s| + |t| · cost(h)). • need suitable hash function. happened with probability < 1 |s| =⇒ expected cost is O(1) per i. – naively h(x) costs |x| – we’ll achieve O(1)! – idea: t[i : i + len(s)] ≈ t[i + 1 : i + 1 + len(s)]. This first block of code is O(|s|) 4 . found match — done – if no.append(c) if rs() == rt(): . assuming it is c Karp-Rabin Application: for c in s: rs.

skip(c): [u − ord(c) · (a|u|−1 mod p)] mod p = [(u mod p) − ord(c) · (a|x−1| mod p)] mod p 5 . Data Structure: Treat string x as a multidigit number u in base a where a denotes the alphabet size..006 Fall 2011 for i in range(len(s).. len(t)): rt.append(t[i]) if rs() == rt(): . 256 • r() = u mod p for (ideally random) prime p ≈ |s| or |t| (division method) • r stores u mod p and |x| (really a|x| ).. not u =⇒ smaller and faster to work with (u mod p fits in one machine word) • r.g.Lecture 9 Hashing II 6. The second block of code is O(|t|) + O(# matches − |s|) to verify. e.skip(t[i-len(s)]) rt.append(c): (u · a + ord(c)) mod p = [(u mod p) · a + ord(c)] mod p • r.

mit.006 Introduction to Algorithms Fall 2011 For information about citing these materials or our Terms of Use. . visit: http://ocw.MIT OpenCourseWare http://ocw.edu/terms.mit.edu 6.

- Pre ILP Intial Exercise SetUploaded byDeepak Mishra
- New Microsoft Word DocumentUploaded byCaleb Romero
- CollectionsFramework.pdfUploaded bySourav Das
- 1307.2982Uploaded bySal Ie Em
- Immadi PulikeshiUploaded byMallikarjun Mbr
- Collections FrameworkUploaded bygowri1111
- SAP ABAP Data Dictionary Questions With Answers Part 2Uploaded bySuresh Logu
- treeiimalectureUploaded bypoonam
- Innodb MysqlUploaded byPhạm Ngọc Trí
- THESIS GerritUploaded bytommeke102_21
- dbms lesson3.2Uploaded byphani
- HW1 RSPUploaded byMuhammad Yusuf Fadhlan
- FRAME 2011 Users ManualUploaded byjovanivan
- Collection 2Uploaded byHaythem Mohamed Ali
- DATA MINING.epubUploaded byCarlos Sanin Quiroz
- Data StructureUploaded bysvpriya233282
- Artificial Intelligence Search AlgorithmsUploaded by.xml
- Session 7-Technique for Order Performance by Similarity to IdealUploaded byVikram Santhanam
- Iterative MethodsUploaded byPiyush Kumar
- Practice Midterm, ENGR 62, StanfordUploaded bysuudsfiin
- 2251-712X-8-26.pdfUploaded byKrishnendu Shaw
- ps4_6245_2004.pdfUploaded bysergiovelasquezg
- Querry ProcessingUploaded byUttam Kesri
- ch6Uploaded byঅনীক সাহা জন
- 106 2 Qualitative ReasoningUploaded byg_sandy_reddy2482
- _dea138dd95831712afcfb9d36eff6986_GraphBasedMethodsUploaded bydrla4
- Intro to Caching, Caching Algorithms and Caching Frameworks, Part 3Uploaded byDmytro Shteflyuk
- Lecture 4 Scan Conversion Bresenhams AlgorithmUploaded byAtik Israk Lemon
- Statement of the problemUploaded byGaurav Kaushik
- IE 515 Case Study 2010 FallUploaded bySrikanth Kumar

- Taller 2Uploaded byNit_su
- ViñetasUploaded byNit_su
- Lenguajes de Programacion 1Uploaded byNit_su
- Taller Clase 1 (1)Uploaded byNit_su
- vacunaUploaded byNit_su
- NotasUploaded byNit_su
- Practica Validacion (1)Uploaded byNit_su
- Plantilla Diseño de ExperimentosUploaded byNit_su
- TABELA ncm medicamentos.xlsUploaded byNit_su
- Tablas.docxUploaded byNit_su
- Ejercicio Practico 01 - WordUploaded byNit_su
- Taller 3Uploaded byNit_su
- Tesis Tinta ContinuaUploaded byNit_su
- practica1 (3)Uploaded byNit_su
- EvidUploaded byNit_su
- code1Uploaded byNit_su
- CodeUploaded byNit_su
- code3Uploaded byNit_su
- Met EdaturUploaded byNit_su
- MATLABUploaded byNit_su
- Laboratorio Ms ProjectUploaded byNit_su
- Sistema RelevanteUploaded byNit_su
- Hotspot MikrotikUploaded byNit_su
- ProbUploaded byNit_su
- Porque NormalizarUploaded byNit_su
- matriculaUploaded byNit_su
- Entrevista a Paul Julian SmithUploaded byNit_su
- bol8Uploaded byNit_su
- ConcluUploaded byNit_su
- PARETO MatriculaUploaded byNit_su