CSC10004: Data Structures and Algorithms

TRIE
Group 3
20127297 Nguyen Ngoc Quang
20127315 Nguyen Chi Tai
20127515 Nguyen Hoang Quoc Huy
20127672 Vu Manh Quan
I. Introduction
II. Concept of Trie
III. Demo step by step
IV. Implementation
V. Real-Life Applications
I. Introduction
History:
•The idea was first described abstractly by Axel Thue in 1912.
•It was first described in a computing context by René de la Briandais in 1959.
•It was described independently in 1960 by Edward Fredkin, who coined the term
trie and pronounced it /ˈtriː/ (as "tree"). Other authors pronounce it /ˈtraɪ/
(as "try") to distinguish it verbally from "tree".
•The name comes from the middle syllable of "retrieval", the task tries are used for.
Pictured: Axel Thue and Edward Fredkin.
II. The concept of Trie (Prefix Tree)
• A trie, also called a digital tree or prefix tree, is a type of search tree: a tree
data structure used for locating specific keys within a set.
• These keys are most often strings. Links between nodes are defined not by the
entire key, but by individual characters.
• A node's position in the trie defines the key with which it is associated. This
distributes the value of each key across the data structure and means that not
every node necessarily has an associated value.
→ All the descendants of a node share a common prefix, namely the string associated
with that node, and the root is associated with the empty string.
• To access a key (to recover its value, change it, or remove it), the trie is
traversed depth-first from the root, following the links between nodes that
correspond to each character in the key.
• Every node of a trie has multiple branches, and each branch represents a
possible character of a key. The last node of every key must be marked as an
end-of-word node; the node field isEndOfWord serves this purpose.
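The node layout described above can be sketched as follows. This is only an illustration: the names TrieNode, ALPHABET_SIZE and branchCount are hypothetical, and the sketch assumes keys made of the lowercase letters 'a' to 'z'.

```cpp
#include <array>
#include <memory>

// Assumption: keys contain only the lowercase letters 'a'..'z'.
constexpr int ALPHABET_SIZE = 26;

// Hypothetical node layout: one branch per possible character.
struct TrieNode {
    // children[i] owns the node reached by the character 'a' + i,
    // or is null when no stored key continues with that character.
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    // True when the path from the root to this node spells a complete key.
    bool isEndOfWord = false;
};

// Returns how many branches currently leave the given node.
int branchCount(const TrieNode& node) {
    int count = 0;
    for (const auto& child : node.children) {
        if (child) ++count;
    }
    return count;
}
```

A freshly created node has no branches and is not an end-of-word node; both change only as keys are inserted.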
Insert:
•Every character of the input key is inserted as an individual trie node.
•The children field is an array of pointers (or references) to the next-level trie
nodes.
•The key character acts as an index into the children array.
•If the input key is new or an extension of an existing key, we construct the
missing nodes of the key and mark the last node as the end of a word.
•If the input key is a prefix of an existing key in the trie, we simply mark the
last node of the key as the end of a word.
•The key length determines the trie depth.
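The two insertion cases can be made visible by counting how many nodes an insertion creates. The sketch below is a minimal illustration, not the group's implementation: TrieNode, ALPHABET_SIZE and insertCountingNewNodes are hypothetical names, and keys are assumed to use only 'a' to 'z'.

```cpp
#include <array>
#include <memory>
#include <string>

constexpr int ALPHABET_SIZE = 26;  // assumption: keys use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    bool isEndOfWord = false;
};

// Inserts key and returns how many new nodes had to be created.
int insertCountingNewNodes(TrieNode& root, const std::string& key) {
    TrieNode* cur = &root;
    int created = 0;
    for (char c : key) {
        int index = c - 'a';  // the key character is the array index
        if (!cur->children[index]) {
            // New key or extension of an existing key: build the missing node.
            cur->children[index] = std::make_unique<TrieNode>();
            ++created;
        }
        cur = cur->children[index].get();
    }
    // In every case the last node is marked as the end of a word.
    cur->isEndOfWord = true;
    return created;
}
```

Inserting "tea" into an empty trie creates three nodes. Inserting "te" afterwards creates none, because "te" is a prefix of an existing key and only the end-of-word mark is set. Inserting "team" then creates a single node, because it extends an existing key.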
Search:
•Searching for a key is similar to insertion, except that we only compare the
characters and move down, without creating nodes.
•The search can terminate because the end of the key is reached or because the
trie has no node for the next character.
•In the first case, the key exists in the trie if the isEndOfWord field of the
last node is true; otherwise it is only a prefix of a stored key.
•In the second case, the search terminates without examining all the characters
of the key, since the key is not present in the trie.
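III. Demo step by step
Figure: a trie holding the keys "THE" and "TEA". Each node is labeled with the prefix it represents (T, TH, TE, THE, TEA); each edge is labeled with a single character (T, H, E, A).
Figure: insert "the" and "tea", then search "tea". The nodes are numbered 1 to 6: Root (1), T (2), TH (3), THE (4), TE (5), TEA (6).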
IV. Implementations
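Insertion into a Trie:
// Assumes keys contain only the uppercase letters 'A'..'Z',
// with one slot per letter in character[].
void Trie::insert(Trie*& root, string key) {
    if (root == nullptr) root = new Trie();
    Trie* cur = root;
    for (size_t i = 0; i < key.length(); i++) {
        int index = key[i] - 'A';
        // Create the child node if this path does not exist yet.
        if (cur->character[index] == nullptr) {
            cur->character[index] = new Trie();
        }
        cur = cur->character[index];
    }
    // Mark the last node as the end of a key.
    cur->check_endofstr = 1;
}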
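Searching in a Trie:
// Assumes keys contain only the uppercase letters 'A'..'Z'.
bool Trie::search(Trie* root, string key) {
    Trie* cur = root;
    if (!cur) return false;
    for (size_t i = 0; i < key.length(); i++) {
        int index = key[i] - 'A';
        cur = cur->character[index];
        // The path ends early: the key is not in the trie.
        if (!cur) return false;
    }
    // All characters matched; the key exists only if this node ends a word.
    return cur->check_endofstr == 1;
}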
Complexity:
•Worst-case search time complexity: Θ(key_length)
•Average-case search time complexity: Θ(key_length)
•Best-case search time complexity: Θ(1), when the first character of the key has no matching child
•Worst-case insertion time complexity: Θ(key_length)
•Worst-case deletion time complexity: Θ(key_length)
Advantages:
• With a trie, we can insert and find strings in O(L) time, where L is the length
of a single word.
→ This is faster than a balanced BST, which needs O(L log n) time for n keys
because each of its O(log n) comparisons can cost O(L). Unlike hashing, a trie
needs no hash function and no collision handling, so its O(L) bound holds even
in the worst case.
• All words can easily be printed in alphabetical order, which is not easily
possible with hashing.
• Prefix search (or auto-complete) can be done efficiently with a trie.
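The alphabetical-order claim follows from visiting children in index order during a depth-first walk. The sketch below illustrates this under the assumption of lowercase keys; TrieNode, insert, collect and sortedKeys are hypothetical names chosen for the example.

```cpp
#include <array>
#include <memory>
#include <string>
#include <vector>

constexpr int ALPHABET_SIZE = 26;  // assumption: keys use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    bool isEndOfWord = false;
};

void insert(TrieNode& root, const std::string& key) {
    TrieNode* cur = &root;
    for (char c : key) {
        int index = c - 'a';
        if (!cur->children[index]) cur->children[index] = std::make_unique<TrieNode>();
        cur = cur->children[index].get();
    }
    cur->isEndOfWord = true;
}

// Depth-first walk. A node's own key is emitted before its descendants and
// children are visited from 'a' to 'z', so keys come out in alphabetical order.
void collect(const TrieNode& node, std::string& prefix, std::vector<std::string>& out) {
    if (node.isEndOfWord) out.push_back(prefix);
    for (int i = 0; i < ALPHABET_SIZE; ++i) {
        if (node.children[i]) {
            prefix.push_back(static_cast<char>('a' + i));
            collect(*node.children[i], prefix, out);
            prefix.pop_back();  // backtrack before trying the next branch
        }
    }
}

std::vector<std::string> sortedKeys(const TrieNode& root) {
    std::vector<std::string> out;
    std::string prefix;
    collect(root, prefix, out);
    return out;
}
```

No sorting step is needed: the order falls out of the structure, regardless of the order in which the keys were inserted.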
Disadvantage:
•The main disadvantage of tries is that they need a lot of memory to store the
strings. Each node holds one pointer per character of the alphabet, and many
of these pointers are often unused.
→ In conclusion, tries are fast but require a large amount of memory to store
the strings.
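The memory cost can be estimated by counting nodes and multiplying by the size of one node. The sketch below is a rough illustration only: the names are hypothetical, keys are assumed to be lowercase, and the estimate ignores allocator overhead.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <string>

constexpr int ALPHABET_SIZE = 26;  // assumption: keys use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    bool isEndOfWord = false;
};

void insert(TrieNode& root, const std::string& key) {
    TrieNode* cur = &root;
    for (char c : key) {
        int index = c - 'a';
        if (!cur->children[index]) cur->children[index] = std::make_unique<TrieNode>();
        cur = cur->children[index].get();
    }
    cur->isEndOfWord = true;
}

// Counts this node plus every node below it.
std::size_t countNodes(const TrieNode& node) {
    std::size_t total = 1;
    for (const auto& child : node.children) {
        if (child) total += countNodes(*child);
    }
    return total;
}

// Rough estimate of the memory held by the trie; allocator overhead is ignored.
std::size_t approxBytes(const TrieNode& root) {
    return countNodes(root) * sizeof(TrieNode);
}
```

On a typical 64-bit platform each node carries 26 pointers of 8 bytes, so a node takes roughly 200 bytes or more. Storing just "the" and "tea" needs six nodes (root included), which is already over a kilobyte for six characters of key data.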
V. Real-Life Applications
a. Auto Complete
• One of the most important applications of the trie data structure.
• A mechanism that predicts the rest of a word the user is typing, based on the
prefix typed so far. Users can select one of the suggestions with the Tab key,
arrow keys, etc.
→ The auto-complete feature speeds up interaction between a user and the
application and improves the user experience, especially when it accurately
predicts the intended word after only the first few characters have been typed.
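A minimal auto-complete sketch: walk down to the node for the typed prefix, then gather every key below it. The names TrieNode, insert, gather and complete are hypothetical, and lowercase keys are assumed; a real editor or browser would also rank and limit the suggestions.

```cpp
#include <array>
#include <memory>
#include <string>
#include <vector>

constexpr int ALPHABET_SIZE = 26;  // assumption: keys use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    bool isEndOfWord = false;
};

void insert(TrieNode& root, const std::string& key) {
    TrieNode* cur = &root;
    for (char c : key) {
        int index = c - 'a';
        if (!cur->children[index]) cur->children[index] = std::make_unique<TrieNode>();
        cur = cur->children[index].get();
    }
    cur->isEndOfWord = true;
}

// Appends every key stored at or below node; word holds the path so far.
void gather(const TrieNode& node, std::string& word, std::vector<std::string>& out) {
    if (node.isEndOfWord) out.push_back(word);
    for (int i = 0; i < ALPHABET_SIZE; ++i) {
        if (node.children[i]) {
            word.push_back(static_cast<char>('a' + i));
            gather(*node.children[i], word, out);
            word.pop_back();
        }
    }
}

// Returns all stored keys that start with prefix, in alphabetical order.
std::vector<std::string> complete(const TrieNode& root, const std::string& prefix) {
    const TrieNode* cur = &root;
    for (char c : prefix) {
        int index = c - 'a';
        if (index < 0 || index >= ALPHABET_SIZE) return {};  // outside the assumed alphabet
        cur = cur->children[index].get();
        if (!cur) return {};  // no stored key starts with this prefix
    }
    std::vector<std::string> out;
    std::string word = prefix;
    gather(*cur, word, out);
    return out;
}
```

Reaching the prefix node costs time proportional to the prefix length, independent of how many keys the trie holds; the remaining cost depends only on the size of the subtree under that node.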
Examples: web browsers, source code editors, IDEs.
b. Spell Checkers
•Spell checking or auto-correct is a three-step process:
1. Check for the word in the dictionary.
2. Generate potential suggestions.
3. Sort the suggestions with the highest priority on top.
•A trie is used to store the dictionary. On top of it, algorithms can be built
that search for words in the dictionary and provide a list of valid words as
suggestions.
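The three steps can be sketched on top of a trie as follows. Several choices here are assumptions made for illustration and do not come from the slides: suggestions are generated by replacing a single character, priority is a hypothetical per-word frequency stored in the end node, and all names are invented for the example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

constexpr int ALPHABET_SIZE = 26;  // assumption: words use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, ALPHABET_SIZE> children{};
    bool isEndOfWord = false;
    int frequency = 0;  // hypothetical priority: higher means more common
};

void insert(TrieNode& root, const std::string& word, int frequency) {
    TrieNode* cur = &root;
    for (char c : word) {
        int index = c - 'a';
        if (!cur->children[index]) cur->children[index] = std::make_unique<TrieNode>();
        cur = cur->children[index].get();
    }
    cur->isEndOfWord = true;
    cur->frequency = frequency;
}

// Returns the end node of word, or nullptr if word is not in the dictionary.
const TrieNode* find(const TrieNode& root, const std::string& word) {
    const TrieNode* cur = &root;
    for (char c : word) {
        int index = c - 'a';
        if (index < 0 || index >= ALPHABET_SIZE) return nullptr;
        cur = cur->children[index].get();
        if (!cur) return nullptr;
    }
    return cur->isEndOfWord ? cur : nullptr;
}

std::vector<std::string> suggest(const TrieNode& root, const std::string& word) {
    // Step 1: a word found in the dictionary needs no suggestions.
    if (find(root, word)) return {};
    // Step 2: generate candidates by replacing one character at a time.
    std::vector<std::pair<int, std::string>> found;
    for (std::size_t i = 0; i < word.size(); ++i) {
        std::string candidate = word;
        for (char c = 'a'; c <= 'z'; ++c) {
            if (c == word[i]) continue;
            candidate[i] = c;
            if (const TrieNode* node = find(root, candidate)) {
                found.emplace_back(node->frequency, candidate);
            }
        }
    }
    // Step 3: highest frequency first; ties are broken alphabetically.
    std::sort(found.begin(), found.end(),
              [](const std::pair<int, std::string>& a, const std::pair<int, std::string>& b) {
                  if (a.first != b.first) return a.first > b.first;
                  return a.second < b.second;
              });
    std::vector<std::string> result;
    for (const auto& entry : found) result.push_back(entry.second);
    return result;
}
```

A production spell checker would usually also consider insertions, deletions and transpositions; single-character replacement is used here only to keep the sketch short.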