Standard, Compressed and Suffix Tries

Uploaded by

Swaathi

0% found this document useful (0 votes)

69 views11 pages

Original Title

Tries

Copyright

Available Formats

PPT, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PPT, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

69 views11 pages

Standard, Compressed and Suffix Tries

Uploaded by

Swaathi

Copyright:

Available Formats

Download as PPT, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 11

Search inside document

Tries

• Standard Tries
• Compressed Tries
• Suffix Tries

1
Text Processing
• We have seen that preprocessing the pattern speeds up pattern
matching queries
• After preprocessing the pattern in time proportional to the pattern
length, the Boyer-Moore algorithm searches an arbitrary English text
in (average) time proportional to the text length
• If the text is large, immutable and searched for often (e.g., works by
Shakespeare), we may want to preprocess the text instead of the
pattern in order to perform pattern matching queries in time
proportional to the pattern length.
• Tradeoffs in text
searching

2
Standard Tries
• The standard trie for a set of strings S is an ordered tree such that:
– each node but the root is labeled with a character
– the children of a node are alphabetically ordered
– the paths from the external nodes to the root yield the strings of S
• Example: standard trie for
the set of strings
S = { bear, bell, bid, bull,
buy, sell, stock, stop }

•A standard trie uses O(n) space. Operations (find, insert, remove) take time
O(dm) each, where:
-n = total size of the strings in S,
-m =size of the string parameter of the operation 3
-d =alphabet size,
Applications of Tries
• A standard trie supports the following operations on a preprocessed text in
time O(m), where m = |X|
-word matching: find the first occurence of word X in the text
-prefix matching: find the first occurrence of the longest prefix of word X
in the text
• Each operation is performed by tracing a path in the trie starting at the
root

4
Compressed Tries
• Trie with nodes of degree at least 2
• Obtained from standard trie by compressing chains of redundant
nodes

Standard Trie:

Compressed Trie:

5
Compact Storage of Compressed
Tries
• A compressed trie can be stored in space O(s), where s = |S|, by using O(1)
space index ranges at the nodes

6
Insertion and Deletion
into/from a Compressed Trie

7
Suffix Tries
• A suffix trie is a compressed trie for all the suffixes of a text
Example:

Compact representation:

8
Properties of Suffix Tries
• The suffix trie for a text X of size n from an alphabet of size d
-stores all the n(n-1)/2 suffixes of X in O(n) space
-supports arbitrary pattern matching and prefix matching queries in
O(dm) time, where m is the length of the pattern
-can be constructed in O(dn) time

9
Tries and Web Search Engines
• The index of a search engine (collection of all searchable words) is stored
into a compressed trie
• Each leaf of the trie is associated with a word and has a list of pages (URLs)
containing that word, called occurrence list
• The trie is kept in internal memory
• The occurrence lists are kept in external memory and are ranked by
relevance
• Boolean queries for sets of words (e.g., Java and coffee) correspond to set
operations (e.g., intersection) on the occurrence lists
• Additional information retrieval techniques are used, such as
– stopword elimination (e.g., ignore “the” “a” “is”)
– stemming (e.g., identify “add” “adding” “added”)
– link analysis (recognize authoritative pages)
10
Tries and Internet Routers
• Computers on the internet (hosts) are identified by a unique 32-bit IP
(internet protocol) addres, usually written in “dotted-quad-decimal” notation
• E.g., www.cs.brown.edu is 128.148.32.110
• Use nslookup on Unix to find out IP addresses
• An organization uses a subset of IP addresses with the same prefix, e.g.,
Brown uses 128.148.*.*, Yale uses 130.132.*.*
• Data is sent to a host by fragmenting it into packets. Each packet carries the
IP address of its destination.
• The internet whose nodes are routers, and whose edges are communication
links.
• A router forwards packets to its neighbors using IP prefix matching rules.
E.g., a packet with IP prefix 128.148. should be forwarded to the Brown
gateway router.
• Routers use tries on the alphabet 0,1 to do prefix matching. 11

Indexing Structure and Huffman Coding
Document60 pages
Indexing Structure and Huffman Coding
Alemayehu Getachew
100% (2)
Kubad Abad Ki̇tabi - 07.11.2018 PDF
Document127 pages
Kubad Abad Ki̇tabi - 07.11.2018 PDF
Hasan
No ratings yet
Design of A Home Automation System Using Arduino PDF
Document10 pages
Design of A Home Automation System Using Arduino PDF
Nayome Devapriya Anga
No ratings yet
TCL and TK
Document60 pages
TCL and TK
Ganesh Nimmala
No ratings yet
Blockchain Unconfirmed Transaction Hack Scriptdocx PDF Free
Document4 pages
Blockchain Unconfirmed Transaction Hack Scriptdocx PDF Free
NadEop
No ratings yet
Python: - Guido Van Rossum
Document180 pages
Python: - Guido Van Rossum
Er Himanshu Singhal
No ratings yet
App Gateway PDF
Document296 pages
App Gateway PDF
Muhammad Majid Khan
No ratings yet
End of Year Test 1&2 Grade 6
Document5 pages
End of Year Test 1&2 Grade 6
Marina
No ratings yet
Getting Started with Python - An Introduction to Python Programming Language Basics
Document118 pages
Getting Started with Python - An Introduction to Python Programming Language Basics
yosia
No ratings yet
NumPy and Pandas for Data Science
Document60 pages
NumPy and Pandas for Data Science
john
No ratings yet
12 Tries
Document10 pages
12 Tries
Nikhil 20528
No ratings yet
Ch11 3 Tries
Document11 pages
Ch11 3 Tries
Aatika Fatima
No ratings yet
Trie Tree
Document21 pages
Trie Tree
Rajan Jaiprakash
No ratings yet
Indexing and Searching Techniques
Document56 pages
Indexing and Searching Techniques
priyankaprakasan
No ratings yet
Suf Tree
Document6 pages
Suf Tree
Mrunal Ruikar
No ratings yet
lecture01
Document28 pages
lecture01
demomento
No ratings yet
IR Chap7
Document30 pages
IR Chap7
biniam teshome
No ratings yet
Lecture 8 - Text Processing
Document10 pages
Lecture 8 - Text Processing
arnold.7800x3d
No ratings yet
Tries Data Structure
Document14 pages
Tries Data Structure
keshav
No ratings yet
Outline and Reading: Tries 4/1/2003 9:02 AM
Document3 pages
Outline and Reading: Tries 4/1/2003 9:02 AM
VenkataLakshmi Krishnasamy
No ratings yet
Advantages Relative To Other Search Algorithms
Document7 pages
Advantages Relative To Other Search Algorithms
Mrunal Ruikar
No ratings yet
ARRAYS, STRINGS, POINTERSclass PDF
Document28 pages
ARRAYS, STRINGS, POINTERSclass PDF
NJENGA Wagithi
No ratings yet
Module 06. String Algorithms Lecture 3-6
Document48 pages
Module 06. String Algorithms Lecture 3-6
Clash Of Clanes
No ratings yet
CS4311 Algorithms - Suffix Tree and Suffix Array for Text Indexing
Document16 pages
CS4311 Algorithms - Suffix Tree and Suffix Array for Text Indexing
Vikash Kumar
No ratings yet
Val. Trie Data Structure Makes
Document1 page
Val. Trie Data Structure Makes
harini
No ratings yet
7-Query Processing
Document47 pages
7-Query Processing
AdwaithAdwaithD
No ratings yet
6-Spelling Correction Soundex
Document52 pages
6-Spelling Correction Soundex
Mariem El Mechry
No ratings yet
Sysad313: Text Processing Filters in Linux: Objectives
Document6 pages
Sysad313: Text Processing Filters in Linux: Objectives
ErwinMacaraig
No ratings yet
Week 11 Lecture
Document61 pages
Week 11 Lecture
Sasi Dharan
No ratings yet
Static Tree Tables
Document29 pages
Static Tree Tables
Hamsa Veni
No ratings yet
Application of Tries
Document38 pages
Application of Tries
Tech_MX
No ratings yet
Introduction To Python
Document56 pages
Introduction To Python
Harinath Pspk
No ratings yet
Lossless Data Compression
Document24 pages
Lossless Data Compression
sojdeep_nagra24
No ratings yet
Unit Iii Data Structure
Document43 pages
Unit Iii Data Structure
Shushanth munna
No ratings yet
Trie
Document6 pages
Trie
Raj Dutta
No ratings yet
Unix Filterss
Document37 pages
Unix Filterss
Singareddy Vivekanandareddy
No ratings yet
EBUS622 - Week 5 - Lecture - Text Preparation
Document40 pages
EBUS622 - Week 5 - Lecture - Text Preparation
kulkarniakshay1402
No ratings yet
Unit - IV
Document30 pages
Unit - IV
Siva Kumar
No ratings yet
Algorithm String
Document185 pages
Algorithm String
Nguyễn Văn Thanh
No ratings yet
Lecture 8-QH
Document36 pages
Lecture 8-QH
Vĩnh Phong
No ratings yet
L12-Externalsorting Indexfiles PDF
Document50 pages
L12-Externalsorting Indexfiles PDF
Andrei Ardelean
No ratings yet
Datastructure
Document11 pages
Datastructure
PRADYUMNA KULKARNI
No ratings yet
Hash Tables and Query Execution Performance
Document32 pages
Hash Tables and Query Execution Performance
ruba71182
No ratings yet
CSI104 Summary
Document114 pages
CSI104 Summary
Duong Xuan Hoang Minh (K18 HL)
No ratings yet
Lec 6-String Processing
Document25 pages
Lec 6-String Processing
sheheryar
100% (1)
What Is A Trie???
Document8 pages
What Is A Trie???
AKHI
No ratings yet
Chapter 3 - String Processing
Document28 pages
Chapter 3 - String Processing
Tanveer Ahmed Hakro
No ratings yet
Trie Insertion
Document31 pages
Trie Insertion
Linux Things
No ratings yet
Intro to communication networks
Document13 pages
Intro to communication networks
Xin Zhao
No ratings yet
Algorithm Short Notes
Document11 pages
Algorithm Short Notes
Shruti Chandra
No ratings yet
SPELL Checking Techniques and Soundex Encoding
Document19 pages
SPELL Checking Techniques and Soundex Encoding
Samurai
No ratings yet
PLP 24
Document43 pages
PLP 24
ashokmohan87
No ratings yet
ENERGY 211 / CME 211: October 6, 2008
Document14 pages
ENERGY 211 / CME 211: October 6, 2008
danty
No ratings yet
BPSM
Document19 pages
BPSM
Jan Spruit
No ratings yet
Lecture03 Parsing 1
Document108 pages
Lecture03 Parsing 1
Nada Shaaban
No ratings yet
Text Processing Algorithms
Document58 pages
Text Processing Algorithms
Công Quân
No ratings yet
Text Compression
Document16 pages
Text Compression
Lekshmi A R
No ratings yet
Essential Skills For Bioinformatics
Document37 pages
Essential Skills For Bioinformatics
J. Perry Stonne
No ratings yet
PPT1 - Introduction To Data Structure
Document37 pages
PPT1 - Introduction To Data Structure
Guntur Wibisono
No ratings yet
Num Tech
Document39 pages
Num Tech
andres python
No ratings yet
Query Languages: Chapter Seven
Document36 pages
Query Languages: Chapter Seven
Sooraa
No ratings yet
Lec 02-04 Ch2 Numpy Part 1
Document66 pages
Lec 02-04 Ch2 Numpy Part 1
MAryam Khan
No ratings yet
06 Query Processing (2) - NDN
Document31 pages
06 Query Processing (2) - NDN
MUHAMMAD ABU RIJAL KUSNAEDI
No ratings yet
Tutorial GCDkit Ver 3.00
Document88 pages
Tutorial GCDkit Ver 3.00
Baso Rezki Maulana
No ratings yet
String Algorithms in C: Efficient Text Representation and Search
From Everand
String Algorithms in C: Efficient Text Representation and Search
Thomas Mailund
No ratings yet
Faxing in Google
Document2 pages
Faxing in Google
HN
No ratings yet
Winter Project
Document73 pages
Winter Project
ranjitkumar828
No ratings yet
Technical Guide For Webrtc
Document24 pages
Technical Guide For Webrtc
JosipProsvjed
No ratings yet
The Lived Experiences of College Students On Marketing Strategies of Umpc
Document13 pages
The Lived Experiences of College Students On Marketing Strategies of Umpc
Jenver Gavilo
No ratings yet
Introduction To Digital Marketing
Document16 pages
Introduction To Digital Marketing
areeza aijaz
No ratings yet
Cyber Attacks
Document1 page
Cyber Attacks
Shreyas Sanghvi
No ratings yet
Ja Veio
Document10 pages
Ja Veio
Fabiosabio2019 Silva
No ratings yet
Quiz VLSM
Document8 pages
Quiz VLSM
Celti
No ratings yet
Javascript Test II: Problem Statement
Document2 pages
Javascript Test II: Problem Statement
Ava White
No ratings yet
MCQ 231 SYBCOM Business-Communication
Document67 pages
MCQ 231 SYBCOM Business-Communication
Shinigami Gaming
No ratings yet
Marketing Types: Offline vs Online
Document2 pages
Marketing Types: Offline vs Online
Sarva Smart
100% (1)
Julian Klein: Food Rescue Hero, Pittsburgh, PA - Communications Intern
Document1 page
Julian Klein: Food Rescue Hero, Pittsburgh, PA - Communications Intern
Julian Klein
No ratings yet
Kerio Connect Adminguide en 7.1.3 2461
Document412 pages
Kerio Connect Adminguide en 7.1.3 2461
James Omara
No ratings yet
RSU Quiz 1: Effects of Globalization
Document4 pages
RSU Quiz 1: Effects of Globalization
Noel S. De Juan Jr.
No ratings yet
The Impact of ICT on HR
Document8 pages
The Impact of ICT on HR
Faruque Abdullah Rumon
No ratings yet
UCM6300 Usermanual
Document468 pages
UCM6300 Usermanual
ims
No ratings yet
TR Contact Center Services NC II
Document45 pages
TR Contact Center Services NC II
Frinces Marvida
No ratings yet
Overview of Multimedia Cloud
Document8 pages
Overview of Multimedia Cloud
IJRASETPublications
No ratings yet
TGBVPN CG Linksys Rv042 v131219 en
Document13 pages
TGBVPN CG Linksys Rv042 v131219 en
Edgar Olguin
No ratings yet
How social media shaped history
Document2 pages
How social media shaped history
Shane Li
No ratings yet
Rhce Interview Questions
Document2 pages
Rhce Interview Questions
surajpokhriyal21
No ratings yet
Customer Engagement Report 2008
Document57 pages
Customer Engagement Report 2008
siddharthmukund
No ratings yet
Network Layer Design Issues - Datagram vs Virtual Circuit Subnets
Document16 pages
Network Layer Design Issues - Datagram vs Virtual Circuit Subnets
Prateek Sancheti
100% (1)
Internet Control Message Protocol (Icmp) : Mcgraw-Hill ©the Mcgraw-Hill Companies, Inc., 2000
Document51 pages
Internet Control Message Protocol (Icmp) : Mcgraw-Hill ©the Mcgraw-Hill Companies, Inc., 2000
Rizqi Fajril
No ratings yet
Collecting Legally Defensible Online Evidence
Document25 pages
Collecting Legally Defensible Online Evidence
todd shipley
No ratings yet