You are on page 1of 50

LECTURE 4

BINARY SEARCH TREES

LECTURER:
DIDAR YEDILKHAN
D.YEDILKHAN@ASTANAIT.EDU.KZ
LESSON AGENDA
1. Binary search trees theory - Basics
2. Binary search trees theory – Search, insert
3. Binary search trees theory - Delete
4. Binary search trees theory – In-order traversal
5. Binary search trees theory – Running times
6. Binary search trees implementation – Node class
7. Binary search trees implementation - Insertion
8. Binary search trees implementation – Traverse, min, max
9. Binary search trees implementation – Testing
10. Binary search trees implementation – Practical (real-world)
applications of trees
LESSON OUTCOMES
By the end of this lesson, you will be able to understand:
1. Basics of binary search trees including operations with binary search
trees (search, insert, delete, traverse)
2. Running time (complexity) of binary search trees
3. Implementation of binary search trees
4. Real-world (practical) applications of binary search trees
INTRODUCTION TO BST
In computer science, binary search trees, or simply BST (sometimes
also called ordered or sorted binary trees), is a note-based data
structure which has following properties: The left subtree of a node
contains only nodes with keys lesser than the node’s key, and the right
subtree of a node contains only nodes with keys greater than the node’s
key.

Definition 2: Binary search trees (BST) are a particular type of


container: a data structure that stores "items" (such as numbers, names
etc.) in memory. They allow fast lookup, addition and removal of items,
and can be used to implement either dynamic sets of items, or lookup
tables that allow finding an item by its key (e.g., finding the phone
number of a person by name).
BINARY SEARCH TREES
TREES
Trees – we have nodes with the data and connection between the
nodes (edges).
TREES
Important remark: In a tree: there must be only a single path from the
root node to any other nodes in the tree.

If there are several ways to get to a given node: it is not a tree


BINARY SEARCH TREES
• Every node can have at most two children: left and right child
• Left child: smaller than the parent
• Right child: greater than the parent
WHY BST IS GOOD?
On every decision we get rid of half of the data in which we are
searching (like in case with binary search trees), meaning the time
complexity is O(logN)
HEIGHT OF A TREE (BST)
Height of a tree is a number of layers it contains
TYPES OF BST
In general: if the time complexity of binary search tree is equal to
O(logN), it means that the tree is balanced

If it is not true, the tree is unbalanced, which means it is


asymmetric, which is a problem
HEIGHT OF A TREE (BST)
Height of the tree, h’: the length of the path from the root to the deepest
node in the tree
• We should keep the height of the tree at a minimum which is h = logN
• If the tree is unbalanced: the h=logN relation is no more valid and the
operation’s running time is no logarithmic
BINARY SEARCH TREES
• Binary search trees are data structures;
• Binary search trees keep the keys in sorted order: so that lookup and
other operations can use the principle of a binary search;
• Each comparison allows the operations to skip over half of the tree,
so that each lookup / insertion / deletion takes time proportional to the
logarithm of the number of items stored in the tree;
• This is much better than the linear time O(N) required to find items by
key in an unsorted array, but slower than the corresponding
operations on hash table (we will learn them later on).
BINARY SEARCH TREES
Sorted arrays Linked lists
Inserting a new item is quite Inserting a new item is very
slow – O(N) fast – O(1)

Searching is quite fast with Searching is sequential – O(N)


binary search – O(logN)
Removing an item is slow – Removing an item is fast
O(N) because of the references –
O(1)

Binary search trees are going to make all of these operations quite fast
with only O(logN) time complexity !!!
LOG N - EXPLAINED
Imagine a classroom of 100 students in which you gave your pen to one
person. Now, you want that pen.
1. You go and ask the first person of the class, if he has the pen. Also,
you ask this person about other 99 people if they have, This is what
we call O(N^2).
2. Going and asking each of them individually is O(N).
3. Now, divide the class in two groups. Ask which group has the pen,
let it be A. Now, divide this group A further in two parts B and C.
Again ask these two groups which have the pen. Take the group (B
or C which has the pen) and further divide it. Repeat the process till
you are left with 1 student who has you pen.
This is what you mean by O(logN).
LOG N - EXPLAINED
LOG N - EXPLAINED
https://www.youtube.com/watch?v=iP897Z5Nerk – Binary search
with Flamengo dance

https://hackernoon.com/what-does-the-time-complexity-o-log-n-act
ually-mean-45f94bb5bfbf
- What does the time complexity O(log n) actually mean?
OPERATIONS WITH BST: INSERT
We start at the root node. If the data we want to insert is greater than the
root node we go to the right, if it is smaller, we go to the left.

1. binarytree.insert (12)
2. binarytree.insert (4)
3. binarytree.insert (5)
4. binarytree.insert (20)
5. binarytree.insert (1)
OPERATIONS: INSERT INTO BST
We start at the root node. If the data we want to insert is greater than the
root node we go to the right, if it is smaller, we go to the left.

1. binarytree.insert (12)
2. binarytree.insert (4)
3. binarytree.insert (5)
4. binarytree.insert (20)
5. binarytree.insert (1)
OPERATIONS WITH BST: SEARCH
We start at the root node. If the data we want to insert is greater than the
root node we go to the right, if it is smaller, we go to the left.

binarytree.search (5)

On every decision: we discard half of the tree, so it is like binary search


in a sorted array – O (logN)
OPERATIONS WITH BST: SEARCH
If we want to find the smallest node: we just have to go to the left as far
as possible -> it will be the smallest one!!!

If we want to find the biggest node: we just have to go to the right as far
as possible -> it will be the greatest one!!!
OPERATIONS WITH BST: DELETE
Easiest way to perform deletion is “soft delete” -> we do not remove the
node from the BST, we just mark that it has been removed – not very
efficient.

There are three cases, if we want to get rid of the node:


1. The node we want to get rid of is a leaf node
2. The node we want to get rid of has a single child
3. The node we want to get rid of has 2 children

Let’s consider each case …


OPERATIONS WITH BST: DELETE
1. The node we want to get rid of is a leaf node

very simple, we just have to remove it (set it to null)

binarytree.remove (5)

Complexity: we have to find the item itself + we have to delete it or set it


to NULL – O(logN) find operation + O(1) deletion = O(logN)
OPERATIONS WITH BST: DELETE
2. The node we want to get rid of has a single child

also simple, we just have to update the references

binarytree.remove (1)

Complexity: we have to find the item we want to get rid of and we have
to update the references – set parent’s pointer point to it’s grandchild
directly. O(logN) find operation + O(1) update references = O(logN)
OPERATIONS WITH BST: DELETE
3. The node we want to get rid of has 2 children

not so simple, but still, we have two options:

a. We look for the largest item in the left subtree (predecessor)


b. We look for the smallest item in the right subtree (successor)
OPERATIONS WITH BST: DELETE
3. The node we want to get rid of has 2 children

not so simple, but still, we have two options:


OPERATIONS WITH BST: DELETE
3. The node we want to get rid of has 2 children

a. we look for the predecessor (or successor) and swap them


b. we end up at a case 1 – when we have to remove a leaf node
OPERATIONS WITH BST: DELETE
3. The node we want to get rid of has 2 children

What will happen in case if we swap with the successor?


OPERATIONS WITH BST: TRAVERSE
Sometimes it is necessary to visit every node in the tree. We can do it in
several ways:
a. In-order traversal (most used one)
b. Pre-order traversal
c. Post-order traversal
OPERATIONS WITH BST: TRAVERSE
a. In-order traversal – we visit the left subtree + the root node + the right
subtree recursively
OPERATIONS WITH BST: TRAVERSE
a. In-order traversal – we visit the left subtree + the root node + the right
subtree recursively

In order traversal: 1 -> 10 -> 16 -> 19 -> 23 -> 32 -> 55 -> 79 --- it can
be seen that the order is numerical
OPERATIONS WITH BST: TRAVERSE
b. Pre-order traversal – we visit the root + left subtree + the right
subtree recursively
OPERATIONS WITH BST: TRAVERSE
b. Pre-order traversal – we visit the root + left subtree + the right
subtree recursively

Pre-order traversal: 32 -> 10 -> 1 -> 19 -> 16 -> 23 -> 55 -> 79 ---
OPERATIONS WITH BST: TRAVERSE
c. Post-order traversal – we visit the left subtree + the right subtree +
the root recursively
OPERATIONS WITH BST: TRAVERSE
c. Post-order traversal – we visit the left subtree + the right subtree +
the root recursively

Post-order traversal: 1 -> 16 -> 23 -> 19 -> 10 -> 79 -> 55 -> 32


RUNNING TIMES
Average case Worst case
Insert O(logN) O(N)
Delete O(logN) O(N)
Search O(logN) O(N)

If the tree becomes unbalanced: the operations running time can be


reduced to O(N) in the worst case - > that is why it is important to keep a
tree as balanced as possible.
IMPLEMENTATION: NODE CLASS
IMPLEMENTATION: INSERT
IMPLEMENTATION: GET MIN AND MAX
IMPLEMENTATION: TRAVERSE
IMPLEMENTATION: TESTING
IMPLEMENTATION: REMOVE
IMPLEMENTATION: TESTING
TASK 1 WEEK 4
1. Build a binary search tree, insert at least 10 elements there and
delete all nodes from the binary tree
2. Build a binary tree and search a specific item from the binary tree and
print its parent node.
REAL-WORLD APPLICATIONS
Trees are extremely powerful if we want to represent hierarchical data
(file system)
-> operating systems
-> game trees (chess)
-> machine learning: decision trees and boosting uses very basic tree
structure
QUESTIONS
1. How many children can a node in binary trees have at most?
• 1
• 2
• 3
QUESTIONS
2. What does in-order traversal mean?
• Visit root node + visit right subtree + visit left subtree
• Visit root node + visit left subtree + visit right subtree
• Visit left subtree + visit root node + visit right subtree
QUESTIONS
3. Is it possible for a binary search tree to become unbalanced?
• Yes
• No
QUESTIONS
4. What does in-order traversal mean?
• Visit root node + visit right subtree + visit left subtree
• Visit root node + visit left subtree + visit right subtree
• Visit left subtree + visit root node + visit right subtree
QUESTIONS
5. What is the problem with unbalanced binary trees?
• Unbalanced binary trees use more memory
• The favorable O(logN) running time may even be reduced to O(N)
running time

You might also like