
23AIE231M Introduction to AI & Data Science

Minor: Artificial Intelligence & Data Science


December 2023

Different Representations of Data I (Data Structures)


Unit 03: L10-L15

Puja Dutta, PhD Amit Agarwal, PhD


Assistant Professor, Civil Engineering Professor, Electrical & Electronics Engineering | Cybersecurity
+91 97432 94057 +91 98679 10690
https://www.amrita.edu/faculty/dr-puja-dutta https://www.amrita.edu/faculty/amit-agarwal
https://www.linkedin.com/in/amit-agarwal-635a548
I. Data Structures
a. Arrays & Matrices
b. Stacks
c. Queues
d. Linked Lists
e. Rooted Trees
f. Hash Tables
g. Binary Search Trees
h. Red-Black Trees
1a. DS: Arrays & Matrices [1/3]

Big-O, Big-Theta & Big-Omega: O(g(n)), Ω(g(n)), Θ(g(n))

If a function f(n) is O(g(n)), then ∃ a constant C such that f(n) ≤ C·g(n) ∀ n ≥ n₀.

If a function f(n) is Ω(g(n)), then ∃ a constant C such that f(n) ≥ C·g(n) ∀ n ≥ n₀.

If a function f(n) is Θ(g(n)), then ∃ constants C₁ & C₂ such that C₁·g(n) ≤ f(n) ≤ C₂·g(n) ∀ n ≥ n₀.

Puja Dutta, CIE, Amit Agarwal, EEE | TIFAC-CORE in Cyber security


1a. DS: Arrays & Matrices [2/3]
A data structure is an organization of data.
Useful data structures are those which facilitate time, memory or energy efficient methods of
storing, querying, retrieving, editing or deleting of the data.
Array: in most programming languages, an array is stored as a contiguous sequence of bytes.
If the 1st element of an array has address a, indices start at s & each element occupies b bytes,
then the i-th element occupies bytes a + b(i − s) through a + b(i − s + 1) − 1.
E.g. if a = 0, s = 1, we have an array of floats with 4 bytes per element, and the total # of floats
to be stored is 10, then the 10th element occupies the space between 0 + 4(10 − 1) and
0 + 4(10 − 1 + 1) − 1, i.e., byte 36 to byte 39.
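The addressing formula above can be sketched directly; a minimal Python check using the slide's symbols a, s, b and i (the function name is ours, not part of the slide):

```python
def element_byte_range(a, s, b, i):
    """Byte range occupied by the i-th element of an array whose first
    element lives at address a, whose indices start at s, and whose
    elements each occupy b bytes (notation follows the slide)."""
    start = a + b * (i - s)
    end = a + b * (i - s + 1) - 1
    return start, end

# The slide's example: a = 0, s = 1, 4-byte floats, 10th element.
print(element_byte_range(0, 1, 4, 10))  # (36, 39)
```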
It is assumed that a computer can access all RAM locations in constant time. Therefore, it takes
constant time to access any array element.
For programming languages which allow an array to hold elements of multiple sizes, the array
does not store the data itself but pointers to it. In such cases, accessing an element's data costs
an extra pointer dereference (an extra level of indirection) on top of the array access.



1a. DS: Arrays & Matrices [3/3]

Four ways to store the matrix 𝑀:


(a) row-major order in a single array;
(b) column-major order in a single array;
(c) row-major order, with 1 array/row & a single array (blue) of pointers to the row array;
(d) column-major order, with 1 array/column & 1 array (blue) of pointers to the column array
If the matrix is too large for the available RAM, such as in satellite image processing,
virtual reality, or the simulation of a ship under ocean waves, you can use (c) or (d).
Alternatively, use a block representation in which the matrix is divided into blocks, and each
block is stored contiguously (the norm in parallel processing).
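As a small illustration of (a) and (b), the flat index of M[i][j] under each order can be sketched as below (0-based indices; the 2×3 matrix and function names are made-up examples, not from the slide):

```python
def row_major_index(i, j, n_cols):
    # Flat position of M[i][j] when each row is stored contiguously.
    return i * n_cols + j

def col_major_index(i, j, n_rows):
    # Flat position of M[i][j] when each column is stored contiguously.
    return j * n_rows + i

# 2x3 matrix M = [[1, 2, 3], [4, 5, 6]]
M = [[1, 2, 3], [4, 5, 6]]
row_major = [M[i][j] for i in range(2) for j in range(3)]  # [1, 2, 3, 4, 5, 6]
col_major = [M[i][j] for j in range(3) for i in range(2)]  # [1, 4, 2, 5, 3, 6]
print(row_major[row_major_index(1, 2, 3)])  # 6
print(col_major[col_major_index(1, 2, 2)])  # 6
```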

1b. DS: Stacks [1/2]
A stack is a linear data structure (LDS) in which the insertion of a new element and the removal of
an existing element take place at the same end, the top of the stack. It implements LIFO.
INSERT (PUSH); DELETE (POP)
A queue is an LDS in which the insertion of a new element and the removal of an existing element
take place at opposite ends, the tail and the head of the queue. It implements FIFO.
Stack implementation: an array S[1:n] implements a stack of ≤ n elements. It has attributes
S.top, which indexes the most recently inserted element, & S.size = n. Thus, the stack comprises S[1:S.top].
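The array-backed stack above can be sketched as follows; a minimal Python version of the S.top / S.size scheme (the class and method names are ours, with a 0 top meaning an empty stack):

```python
class ArrayStack:
    """Stack of at most n elements backed by a fixed array S[1:n],
    following the slide's attributes S.top and S.size."""
    def __init__(self, n):
        self.S = [None] * n
        self.size = n
        self.top = 0            # 0 means the stack is empty

    def is_empty(self):
        return self.top == 0

    def push(self, x):          # INSERT: O(1)
        if self.top == self.size:
            raise OverflowError("stack overflow")
        self.top += 1
        self.S[self.top - 1] = x

    def pop(self):              # DELETE: O(1)
        if self.is_empty():
            raise IndexError("stack underflow")
        self.top -= 1
        return self.S[self.top]

s = ArrayStack(4)
s.push(15); s.push(6); s.push(2); s.push(9)
print(s.pop())  # 9 (LIFO: last in, first out)
```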



1b. DS: Stacks [2/2]
With reference to the figure in the last slide, stack elements appear only in the cream positions.
a) Stack S has 4 elements. b) Stack S after the calls PUSH(S, 17) & PUSH(S, 3). c) Stack S after
the call POP(S) returns 3. Although element 3 is still in the array, it is no longer in the stack.
The top is element 17.
Each of the 3 stack operations in the inset takes O(1) time.

1c. DS: Queues [1/2]
A queue has a HEAD & a TAIL.

The queue wraps around. When Q.head = Q.tail, the queue is empty.

A queue is implemented using array Q[1:12]. Its elements appear only in the cream positions.
a) The queue has 5 elements, in Q[7:11].
b) The queue after ENQUEUE(Q, 17), ENQUEUE(Q, 3) & ENQUEUE(Q, 5).
c) The configuration of the queue after DEQUEUE(Q), which returns 15 and moves the head
pointer to array element 8.



1c. DS: Queues [2/2]
The queue operations are shown in the figure.

Each of the queue operations takes 𝑂 1 time.
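The wraparound behaviour can be sketched as below; a minimal Python circular queue in the spirit of the slide's figure (as an implementation choice of this sketch, one slot is kept unused so that head == tail unambiguously means "empty"):

```python
class CircularQueue:
    """Queue backed by a fixed array with wraparound (modular indexing)."""
    def __init__(self, n):
        self.Q = [None] * n
        self.n = n
        self.head = 0
        self.tail = 0           # index of the next free slot

    def enqueue(self, x):       # O(1)
        nxt = (self.tail + 1) % self.n
        if nxt == self.head:
            raise OverflowError("queue overflow")
        self.Q[self.tail] = x
        self.tail = nxt

    def dequeue(self):          # O(1)
        if self.head == self.tail:
            raise IndexError("queue underflow")
        x = self.Q[self.head]
        self.head = (self.head + 1) % self.n
        return x

q = CircularQueue(12)
for x in (15, 6, 9, 8, 4):      # the five elements from the figure
    q.enqueue(x)
q.enqueue(17); q.enqueue(3); q.enqueue(5)
print(q.dequeue())  # 15 (FIFO: first in, first out)
```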

1d. DS: Linked Lists [1/3]
A linked List is a generalization of arrays.
Why is it needed? You array may be too large
to be stored in a contiguous memory.
Or, pre-allocation may not work as a dynamic amount of memory, determined only at some
run-time stage, may be needed. Array memory allocation occurs at compile time; linked list
memory allocation occurs at run-time.
So, what is the solution?
1) Assign whatever contiguous memory you have in the form of an array.
2) Find another contiguous and allocate that too.
3) Repeat Step 2) until all required memory is allocated.
4) One must additionally store an element in each contiguous block that tells the address of the
previous contiguous array and, the next.
Problem: Are insertions and deletions faster in linked lists or arrays? Justify.
Problem: Is data retrieval faster in linked lists or arrays? Justify.
1d. DS: Linked Lists [2/3]
Insertions and deletions at specific positions in a linked list are O(N/k), where N is the number of
elements in the linked list and k is the number of linked lists.
Data retrieval is O(log N) if the linked list has sorted elements.

Problem: How much extra memory is needed by the linked list in comparison with an array for
storing the same amount of data? Show your calculations.

A (singly) linked list stores only the address of the next contiguous block.
A doubly linked list stores both the address of the next contiguous block and the address of
the previous contiguous block.

Problem: Show by means of an example the advantage of using a doubly linked list.

A circular doubly linked list (CDLL) is a combination of a circular and a doubly linked list.

CDLL applications: playlists, browser history, undo/redo operations, OS task scheduling …
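A doubly linked list can be sketched as below; a minimal Python version (class and method names are ours) showing why the prev pointer matters: deleting a node we already hold a pointer to needs no scan.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None   # address of the previous block
        self.next = None   # address of the next block

class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def insert_front(self, key):     # O(1) insertion at the head
        x = Node(key)
        x.next = self.head
        if self.head is not None:
            self.head.prev = x
        self.head = x
        return x

    def delete(self, x):             # O(1) given a pointer to node x
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.head = x.next
        if x.next is not None:
            x.next.prev = x.prev

    def keys(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.key)
            cur = cur.next
        return out

L = DoublyLinkedList()
n1 = L.insert_front(1); n4 = L.insert_front(4); L.insert_front(9)
L.delete(n4)                 # O(1): no scan needed thanks to x.prev
print(L.keys())  # [9, 1]
```

In a singly linked list the same deletion would require walking from the head to find the predecessor of x.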



1d. DS: Linked Lists [3/3]

1e. DS: Rooted Trees [1/3]
Linked lists work well for representing linear relationships, but not all relationships are linear.

We will learn about representing rooted trees with linked data structures
➢ Rooted binary trees
➢ Rooted trees in which each node can have an arbitrary # of children

1. Rooted Binary Trees

Attributes of a tree node x are p, left & right. They store pointers to the parent, left child &
right child of each node in a binary tree T.

For the root, x.p = NIL. If a node x does not have a left child (right child), then x.left (x.right) = NIL.

If the tree’s root, T.root, is NIL, then the tree T is empty.
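The node layout above can be sketched directly; a minimal Python version with the slide's attributes p, left and right, where None plays the role of NIL (the helper names are ours):

```python
class TreeNode:
    """Node of a rooted binary tree with attributes p, left, right."""
    def __init__(self, key):
        self.key = key
        self.p = None      # parent; None (NIL) for the root
        self.left = None   # left child, or None
        self.right = None  # right child, or None

def attach_left(parent, child):
    parent.left = child
    child.p = parent

def attach_right(parent, child):
    parent.right = child
    child.p = parent

root = TreeNode(10)
attach_left(root, TreeNode(4))
attach_right(root, TreeNode(14))
print(root.p is None, root.left.key, root.right.key)  # True 4 14
```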



1e. DS: Rooted Trees [2/3]
2. Rooted Trees with Unbounded Branching

If each node stores one pointer per child and the branching is unbounded, an impractical amount
of memory is required (why?). Even if the branching factor is bounded by some large k, when the
average branching factor is small, a significant amount of memory is wasted (it will be O(Nk); why?).

However, there is a smart way of achieving this which takes only O(N) memory. It is called the
left-child, right-sibling representation for a rooted tree with unbounded branching:
Each node x has only 2 pointers instead of a pointer to each of its children.
x.left-child points to the leftmost child of node x.
x.right-sibling points to the sibling of x immediately to its right.

If a node x does not have any child, then x.left-child = NIL.

If a node x is the rightmost child of its parent, then x.right-sibling = NIL.
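The two-pointer scheme can be sketched as below; a minimal Python left-child, right-sibling node (class and helper names are ours; this sketch adds each new child at the left, so children are added right-to-left to read in order):

```python
class LCRSNode:
    """Left-child, right-sibling node: two pointers per node,
    O(N) memory for a tree with any branching factor."""
    def __init__(self, key):
        self.key = key
        self.left_child = None     # leftmost child; None (NIL) if no children
        self.right_sibling = None  # next sibling to the right; None if rightmost

def add_child(parent, child):
    # New child becomes the leftmost one; the old leftmost becomes its sibling.
    child.right_sibling = parent.left_child
    parent.left_child = child

def children(x):
    # Walk the sibling chain to recover all children of x.
    out, c = [], x.left_child
    while c is not None:
        out.append(c.key)
        c = c.right_sibling
    return out

root = LCRSNode('r')
for k in ('c', 'b', 'a'):     # added right-to-left so the list reads a, b, c
    add_child(root, LCRSNode(k))
print(children(root))  # ['a', 'b', 'c']
```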



1e. DS: Rooted Trees [3/3]
2. Rooted Trees with Unbounded Branching

1f. DS: Hash Tables [1/9]
Hash Table is a data structure to efficiently implement dictionaries – the
INSERT, SEARCH & DELETE operations.

In the worst case, searching for an element in a hash table takes the same time as in a
linked list, Θ(n); under some reasonable assumptions, however, it takes only O(1) on average.

Information retrieval also takes O(1). This requires the use of direct addressing, i.e., each key
must map to a specific location. When #(keys stored) ≪ #(possible keys), the hash table uses
indirect instead of direct addressing, as a hash table typically uses an array of size
proportional to the number of keys actually stored.

Hash tables effectively leverage the hierarchical memory system of a computer.



1f. DS: Hash Tables [2/9]
1.1 Direct-address hash table: it works well if the number of keys in the universe of keys,
U = {0, 1, …, m − 1}, is small. Each position corresponds to a key in U. See the figure.
Some positions will remain empty, as illustrated.
T is a direct-address table, implemented as an array T[0 : m − 1].
T[k] returns the element pointed to by key k. If T[k] = NIL, then ∄ an element with key k.
All dictionary operations take O(1):
- DIRECT-ADDRESS-SEARCH
- DIRECT-ADDRESS-INSERT
- DIRECT-ADDRESS-DELETE
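The three O(1) operations can be sketched as below; a minimal Python direct-address table (the Element class is our assumption: elements are taken to carry a .key attribute in {0, …, m − 1}):

```python
class DirectAddressTable:
    """Direct-address table T[0:m-1]; every dictionary operation is O(1)."""
    def __init__(self, m):
        self.T = [None] * m

    def search(self, k):          # DIRECT-ADDRESS-SEARCH
        return self.T[k]          # None stands in for NIL

    def insert(self, x):          # DIRECT-ADDRESS-INSERT
        self.T[x.key] = x

    def delete(self, x):          # DIRECT-ADDRESS-DELETE
        self.T[x.key] = None

class Element:
    def __init__(self, key, data):
        self.key, self.data = key, data

T = DirectAddressTable(10)
T.insert(Element(3, "satellite"))
print(T.search(3).data)   # satellite
print(T.search(5))        # None: no element with key 5
```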



1f. DS: Hash Tables [3/9]
1.2 (Indirect-address) hash table: it is clear that direct-address hash
tables are impractical for large |U|, even if |K| ≪ |U|.
The storage requirement reduces from Θ(|U|) to Θ(|K|).
However, while the average-case search time remains O(1), it no longer
holds in the worst case, as it does for a direct-address hash table.
With direct addressing, an element with key k is stored in
slot k, but with hashing, we use a hash function h to
compute the slot number from the key k, so that the
element goes into slot h(k).
h : U → {0, 1, …, m − 1}. An element with key k hashes to
slot h(k); h(k) is the hash value of k.
Unfortunately, two or more keys may have the same hash
value. This is called a collision. To minimize collisions, h must
1) appear random but,
2) be deterministic, i.e., the same input must always produce the same output.
1f. DS: Hash Tables [4/9]
Problem: what is the minimum number of collisions?
Solution: at least 1 if |U| > m, since by the pigeonhole principle two keys in U must then share a hash value.
As collisions are unavoidable, the goal of hashing algorithm design is to:
1) design a hash function that minimizes the number of collisions (such a function must appear
random but be deterministic) &
2) have a method for resolving the collisions when they occur
Examples of Hashing Algorithms: CRC-32, MD5, SHA-1
Two approaches to resolve collisions:
- open addressing (aka closed hashing): once a collision is discovered, alternative locations in
the array {0, 1, …, m − 1} are probed and the colliding element is placed in one of them. This
works well if #(collisions) ≪ #(empty cells in the array).
- separate chaining: once a collision occurs, build a separate linked list for each set of collided
items, which can be traversed to access the item with a unique search key.
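Separate chaining can be sketched as below; a minimal Python table in which each slot holds a Python list standing in for the chain. The hash h(k) = k mod m is a placeholder for illustration only, not a recommended hash function:

```python
class ChainedHashTable:
    """Hash table resolving collisions by separate chaining."""
    def __init__(self, m):
        self.m = m
        self.T = [[] for _ in range(m)]   # one chain per slot

    def h(self, k):
        return k % self.m                 # placeholder hash function

    def insert(self, k, v):
        chain = self.T[self.h(k)]
        for i, (key, _) in enumerate(chain):
            if key == k:                  # key already present: overwrite
                chain[i] = (k, v)
                return
        chain.append((k, v))

    def search(self, k):
        for key, v in self.T[self.h(k)]:
            if key == k:
                return v
        return None                       # unsuccessful search

    def delete(self, k):
        chain = self.T[self.h(k)]
        self.T[self.h(k)] = [(key, v) for key, v in chain if key != k]

H = ChainedHashTable(5)
H.insert(7, "a"); H.insert(12, "b")   # 7 and 12 collide: both hash to slot 2
print(H.search(7), H.search(12))      # a b
H.delete(7)
print(H.search(7))                    # None
```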



1f. DS: Hash Tables [5/9]
A balanced tree will be a better implementation than a linked list in case the load factor, n/m,
i.e., #[entries occupied in the hash table] ÷ size of the array, is ≫ 1.
In practice, a hash table is resized if the load factor is too low or too high.

The image shows that for a small load factor, open addressing is more efficient* than separate
chaining for resolving collisions**.
*in terms of cache look-up misses
**linear probing is an instance of open addressing
1f. DS: Hash Tables [6/9]
Now we look at separate chaining in greater detail.

Independent uniform hashing: the seemingly random function h has, ∀ input k ∈ U, an output
h(k) such that ∀ subsequent call to h with the same input k, one gets the same output h(k).

Insertion of a key not already in the table takes O(1) time. Otherwise, first search whether
x.key is already in the hash table. This takes O(n/m), i.e., O(α) time if the table structure is
an array, or O(log₂(n/m)) if it is a balanced tree.

Deletion takes O(1) time if the list is a doubly-linked list. If it were a singly-linked list, then it
would take O(n/m) time.



1f. DS: Hash Tables [7/9]
Analysis of hashing with chaining: this will be in terms of α, the load factor, which is n/m.

From the previous slide, insertion dominates.

In the extreme, all keys map to the same hash value. In that case, if the list is implemented as
an array, searching is Θ(n/m) = Θ(n/1) = Θ(n), i.e., Θ(α).

Problem: If the array length is m and key hashes are independent, then what is the probability
that 2 distinct keys collide?
Solution: A priori, a key key₁ goes into any of the m cells.
Similarly, key key₂ goes into any of the m cells (and so on).
For a collision to happen, key₂ must go into the same cell as key₁ is already in. The
probability of this is 1/m, as key₂ could have gone into any of the m cells.

For each cell j, the length of the list T[j] is nⱼ, with n₀ + n₁ + ⋯ + nₘ₋₁ = n. E[nⱼ] = α = n/m.
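The claim E[nⱼ] = n/m can be checked empirically; a Monte-Carlo sketch in Python (the function name, trial count and seed are our choices), counting how many of n uniformly hashed keys land in one fixed slot:

```python
import random

def expected_slot_load(n, m, trials=2000, seed=1):
    """Monte-Carlo estimate of E[n_j], the expected length of one chain,
    when n independent uniform keys hash into m slots (theory: alpha = n/m)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Count how many of the n keys land in slot 0.
        total += sum(1 for _ in range(n) if rng.randrange(m) == 0)
    return total / trials

print(expected_slot_load(100, 10))  # close to alpha = 100 / 10 = 10
```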



1f. DS: Hash Tables [8/9]
Problem: In a hash table in which collisions are resolved by chaining, an unsuccessful search takes
Θ(1 + α) time on average, under the assumption of independent uniform hashing.
Proof:
Assertion 1: ∵ it is an unsuccessful search, the hash value has not been computed before (else, it
could have been retrieved from another storage). Computing the hash value h(k) takes Θ(1).
Assertion 2: Using h(k), we index into a list. This also takes Θ(1) time as h(k) maps to a
specific RAM address.
Thus, so far we have Θ(1) + Θ(1) = Θ(2) = Θ(1) time. NOTE: here we have assumed that the
two Θ(1) terms (shown in different colors on the slide) are equivalent. They generally are not: in the
former case you are computing a hash value, which is far more expensive than locating a memory
element in a solid-state RAM.
Assertion 3: If collisions are stored as a list (why as a list & not as an array?), then one will need
to search through the full list corresponding to the hash value h(k). This will take time
proportional to the length of the list, which is expected to be α for good hashing algorithms.
Thus, checking the list takes Θ(α) time.
However, if the list were stored as a balanced tree, it would take Θ(log₂ α) instead of Θ(α) time.
1f. DS: Hash Tables [9/9]
Thus, the total time spent in an unsuccessful search is Θ(1 + α) if collisions are stored as lists, or
Θ(1 + log₂ α) if stored as a balanced tree.
Problem: for the same problem, how much time is needed if collisions were stored as an array?
Solution: DIY
Problem: Use the arguments from the last problem to prove the average time complexity of arriving
at a successful hit for an element in a hash table that implements separate chaining.
Solution: DIY

1g. DS: Binary Search Trees (BST) [1/8]
Basic operations a BST data structure must support are: INSERT, SEARCH,
DELETE, MIN, MAX, PREDECESSOR & SUCCESSOR.
These operations are the same for both a dictionary and a priority queue. Thus, a BST can be
used to implement both data structures.
All basic operations on a BST can be executed in Θ(log₂ n) expected time. The same operations
take Θ(n) time instead if one uses lists. A variation of the BST called the red-black tree
guarantees O(log₂ n) time even in the worst case.
The expected height of a BST built on a random set of keys is O(log₂ n).
The worst-case running time of a BST operation is proportional to the height of the tree.
Each BST node comprises:
➢ Key
➢ Data
➢ A pointer to the parent
➢ A pointer to the left child
➢ A pointer to the right child
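The node layout above, together with insertion, can be sketched in Python (class and function names are ours; the insert walks down from the root in O(h) time, going left on smaller keys):

```python
class BSTNode:
    """BST node: key, data, and pointers to parent, left and right child."""
    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.p = self.left = self.right = None

def tree_insert(root, z):
    """Insert node z into the BST rooted at root, preserving the
    key-storing property; returns the (possibly new) root. O(h) time."""
    y, x = None, root
    while x is not None:          # walk down to a NIL position
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        return z                  # tree was empty; z becomes the root
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root

root = None
for k in (6, 5, 7, 2, 5, 8):
    root = tree_insert(root, BSTNode(k))
print(root.key, root.left.key, root.right.key)  # 6 5 7
```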
1g. DS: Binary Search Trees (BST) [2/8]
Two BSTs with the same elements. One is more efficient than the other.
Also shown is the data stored at each node.
PARENT(T.root) = NIL
Key-storing property: Let x be a node in a BST.
If y is a node in the left subtree of x, then y.key ≤ x.key.
If y is a node in the right subtree of x, then y.key ≥ x.key.



1g. DS: Binary Search Trees (BST) [3/8]
Animations of INORDER-TREE-WALK, PREORDER-TREE-WALK & POSTORDER-TREE-WALK.
Complexity of INORDER-TREE-WALK is Θ(n).
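The three walks can be sketched as below; a minimal Python version on a small hand-built tree (node class and function names are ours). Each node is handled exactly once, which is why every walk is Θ(n):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(x, out):
    # Left subtree, then the node, then the right subtree.
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)

def preorder(x, out):
    # The node first, then its subtrees.
    if x is not None:
        out.append(x.key)
        preorder(x.left, out)
        preorder(x.right, out)

def postorder(x, out):
    # Both subtrees first, then the node.
    if x is not None:
        postorder(x.left, out)
        postorder(x.right, out)
        out.append(x.key)

#        6
#       / \
#      5   7
#     /
#    2
root = Node(6, Node(5, Node(2)), Node(7))
for walk in (inorder, preorder, postorder):
    out = []
    walk(root, out)
    print(walk.__name__, out)
# inorder [2, 5, 6, 7]   <- sorted order for a BST
# preorder [6, 5, 2, 7]
# postorder [2, 5, 7, 6]
```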



1g. DS: Binary Search Trees (BST) [4/8]
Queries on a BST: MIN, MAX, SUCCESSOR, PREDECESSOR & SEARCH.
Each of these queries takes O(h), where h is the height of the BST.
Given a pointer x to the root of a subtree and a key k, TREE-SEARCH(x, k) returns a
pointer to the node with key k.
To search the entire BST, call TREE-SEARCH(T.root, k).



1g. DS: Binary Search Trees (BST) [5/8]
Min & Max: To find an element in a binary search tree whose key is a minimum, just follow left-child
pointers from the root until you encounter a NIL.
The key-storing property maintained during construction guarantees that the
TREE-MIN & TREE-MAX procedures are correct.

Successor & Predecessor: the successor of a node is the next node visited in an inorder tree
walk. The structure of a binary search tree allows you to determine the successor of a node
without comparing keys.
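Both ideas can be sketched together; a minimal Python version (node class and helper names are ours) where the successor is found purely by following pointers, never comparing keys:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.p = self.left = self.right = None

def link(parent, child, side):
    setattr(parent, side, child)   # side is "left" or "right"
    child.p = parent

def tree_minimum(x):
    # Follow left-child pointers until NIL: O(h) time.
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    """Next node in an inorder walk, found without comparing keys."""
    if x.right is not None:
        return tree_minimum(x.right)       # leftmost node of right subtree
    y = x.p
    while y is not None and x is y.right:  # climb until we leave a right subtree
        x, y = y, y.p
    return y

# BST:      6
#          / \
#         5   7
#        /
#       2
n6, n5, n7, n2 = Node(6), Node(5), Node(7), Node(2)
link(n6, n5, "left"); link(n6, n7, "right"); link(n5, n2, "left")

print(tree_minimum(n6).key)    # 2
print(tree_successor(n5).key)  # 6 (5 has no right child, so climb to 6)
print(tree_successor(n2).key)  # 5
```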

1h. Red-Black Trees [1/7]
TBD if time permits
