You are on page 1of 8

TREE

A tree data structure is a hierarchical structure that is used to represent and organize data in a way that is easy to navigate and
search. It is a collection of nodes that are connected by edges and has a hierarchical relationship between the nodes.
The topmost node of the tree is called the root, and the nodes below it are called the child nodes. Each node can have multiple child
nodes, and these child nodes can also have their own child nodes, forming a recursive structure.
This data structure is a specialized method to organize and store data in the computer to be used more effectively. It consists of a
central node, structural nodes, and sub-nodes, which are connected via edges. We can also say that tree data structure has roots,
branches, and leaves connected with one another.

Why Tree is considered a non-linear data structure?(imp)


The data in a tree are not stored in a sequential manner i.e., they are not stored linearly. Instead, they are arranged on multiple
levels or we can say it is a hierarchical structure. For this reason, the tree is considered to be a non-linear data structure.

Basic Terminologies in Tree Data Structure:

 Parent Node: The node which is a predecessor


of a node is called the parent node of that
node. {B} is the parent node of {D, E}.
 Child Node: The node which is the immediate
successor of a node is called the child node of
that node. Examples: {D, E} are the child nodes
of {B}.
 Root Node: The topmost node of a tree or the
node which does not have any parent node is
called the root node. {A} is the root node of the
tree. A non-empty tree must contain exactly
one root node and exactly one path from the
root to all other nodes of the tree.
 Leaf Node or External Node: The nodes
which do not have any child nodes are called
leaf nodes. {K, L, M, N, O, P, G} are the leaf
nodes of the tree.
 Ancestor of a Node: Any predecessor nodes on the path of the root to that node are called Ancestors of that
node. {A,B} are the ancestor nodes of the node {E}
 Descendant: Any successor node on the path from the leaf node to that node. {E,I} are the descendants of the
node {B}.
 Sibling: Children of the same parent node are called siblings. {D,E} are called siblings.
 Level of a node: The count of edges on the path from the root node to that node. The root node has level 0.
 Internal node: A node with at least one child is called Internal Node.
 Neighbour of a Node: Parent or child nodes of that node are called neighbors of that node.
 Subtree: Any node of the tree along with its descendant.

Representation of Tree Data Structure:


A tree consists of a root, and zero or more subtrees T1, T2, … , Tk such that there is
an edge from the root of the tree to the root of each subtree.

Example of Tree data structure


Here,
 Node 1 is the root node
 1 is the parent of 2 and 3
 2 and 3 are the siblings
 4, 5, 6, and 7 are the leaf nodes
 1

and 2 are the ancestors of 5

Types of Tree data structures:


 Binary tree: In a binary tree, each node can have a maximum of two children linked to it. Some common types of
binary trees include full binary trees, complete binary trees, balanced binary trees, and degenerate or pathological
binary trees.
 Ternary Tree: A Ternary Tree is a tree data structure in which each node has at most three child nodes, usually
distinguished as “left”, “mid” and “right”.
 N-ary Tree or Generic Tree : Generic trees are a collection of nodes where each node is a data structure that consists
of records and a list of references to its children(duplicate references are not allowed). Unlike the linked list, each
node stores the address of multiple nodes.

Basic Operation of Tree Data Structure:


 Create – create a tree in the data structure.
 Insert − Inserts data in a tree.
 Search − Searches specific data in a tree to check whether it is present or not.
 Traversal:
 Preorder Traversal – perform Traveling a tree in a pre-order manner in the data structure.
 In order Traversal – perform Traveling a tree in an in-order manner.
 Post-order Traversal –perform Traveling a tree in a post-order manner.

Properties of Tree Data Structure:


 Number of edges: An edge can be defined as the connection between two nodes. If a tree has N nodes then it will
have (N-1) edges. There is only one path from each node to any other node of the tree.
 Depth of a node: The depth of a node is defined as the length of the path from the root to that node. Each edge adds 1
unit of length to the path. So, it can also be defined as the number of edges in the path from the root of the tree to the
node.
 Height of a node: The height of a node can be defined as the length of the longest path from the node to a leaf node
of the tree.
 Height of the Tree: The height of a tree is the length of the longest path from the root of the tree to a leaf node of the
tree.
 Degree of a Node: The total count of subtrees attached to that node is called the degree of the node. The degree of a
leaf node must be 0. The degree of a tree is the maximum degree of a node among all the nodes in the tree.

Application of Tree Data Structure:


 File System: This allows for efficient navigation and organization of files.
 Data Compression: Huffman coding is a popular technique for data compression that involves constructing a binary
tree where the leaves represent characters and their frequency of occurrence. The resulting tree is used to encode the
data in a way that minimizes the amount of storage required.
 Compiler Design: In compiler design, a syntax tree is used to represent the structure of a program.
 Database Indexing: B-trees and other tree structures are used in database indexing to efficiently search for and
retrieve data.

Advantages of Tree Data Structure:


 Tree offer Efficient Searching Depending on the type of tree, with average search times of O(log n) for balanced
trees like AVL.
 Trees provide a hierarchical representation of data, making it easy to organize and navigate large amounts of
information.
 The recursive nature of trees makes them easy to traverse and manipulate using recursive algorithms.
Disadvantages of Tree Data Structure:
 Unbalanced Trees, meaning that the height of the tree is skewed towards one side, which can lead to inefficient
search times.
 Trees demand more memory space requirements than some other data structures like arrays and linked lists,
especially if the tree is very large.
 The implementation and manipulation of trees can be complex and require a good understanding of the algorithms.

A binary tree is a tree-type non-linear data structure with a maximum of two children for each parent. Every node in a binary
tree has a left and right reference along with the data element. The node at the top of the hierarchy of a tree is called the root
node. The nodes that hold other sub-nodes are the parent nodes.A parent node has two child nodes: the left child and right child.
Hashing, routing data for network traffic, data compression, preparing binary heaps, and binary search trees are some of the
applications that use a binary tree.

 Node: It represents a termination point in a tree.


 Root: A tree’s topmost node.
 Parent: Each node (apart from the root) in a tree that has at
least one sub-node of its own is called a parent node.
 Child: A node that straightway came from a parent node when
moving away from the root is the child node.
 Leaf Node: These are external nodes. They are the nodes that
have no child.
 Internal Node: As the name suggests, these are inner nodes
with at least one child.
 Depth of a Tree: The number of edges from the tree’s node to
the root is.
 Height of a Tree: It is the number of edges from the node to
the deepest leaf. The tree height is also considered the root
height.

Say a binary tree placed at a height equal to 3. In that case, the highest number of nodes for this height 3 stands equal to 15, that
is, (1+2+4+8) = 15. In basic terms, the maximum node number possible for this height h is (20 + 21 + 22+….2h) = 2h+1 -1.

 Now, for the minimum node number that is possible at this height h, it comes as equal to h+1.
 If there are a minimum number of nodes, then the height of a binary tree would stand aa maximum. On the other hand,
when there is a number of a node at its maximum, then the binary tree m height will be minimum. If there exists around
‘n’ number nodes in a binary tree, here is a calculation to clarify the binary tree definition.
 The tree’s minimum height is computed as:
 n = 2h+1 -1
 n+1 = 2h+1
Taking log for both sides now,
 log2(n+1) = log2(2h+1)
 log2(n+1) = h+1
 h = log2(n+1) – 1
The highest height will be computed as:
 n = h+1
 h= n-1

In case there exists n number of nodes in any binary tree, the height stands given by log n logn. This is due to the simple reason
that for any given node in any binary tree, there will be two child nodes at the most. This drives us to an explanation to define
binary tree: for every level or every height of any binary tree, the node number must be the same as around half the node numbers
present at the next level.
In the form of an answer to define binary tree, the node number at every level is close to double the node number at its previous
level.
It means, that when a binary tree comes with height h, the number of nodes for define binary tree is n = (2 ^ 0) + (2 ^ 1) + (2 ^ 2)
+ (2 ^ 3) + ….. + (2 ^ (h-1))(20)+(21)+(22)+(23)+…..+(2(h−1))

From the `mathematical induction point for what is a binary tree, this is what we know-
(2 ^ 0) + (2 ^ 1) + (2 ^ 2) + (2 ^ 3) + ….. + (2 ^ {(h-1)}) = (2 ^ h)-1(20)+(21)+(22)+(23)+…..+(2(h−1))=(2h)−1

Hence,
(2 ^ h)-1 = n => 2 ^ h = n + 1 => h = log2(n+1)(2h)−1=n=>2h=n+1=>h=log2(n+1)

Therefore, the minimum height for a binary tree is roughly equal to log(n) roughly. This helps you better understand what is a
binary tree.Also, the minimum number of nodes that are possible at height h of the binary tree can be known by h+1.If the binary
tree comes with an L number for leaf nodes, the height is represented by L + 1.
Types of Binary Trees

1. Full Binary Tree:It is a special kind of a binary tree that has either zero children or
two children. It means that all the nodes in that binary tree should either have two child
nodes of its parent node or the parent node is itself the leaf node or the external node.
In other words, a full binary tree is a unique binary tree where every node except the
external node has two children. When it holds a single child, such a binary tree will not be
a full binary tree. Here, the quantity of leaf nodes is equal to the number of internal
nodes plus one. The equation is like L=I+1, where L is the number of leaf nodes, and I is
the number of internal nodes.

2. Complete Binary Tree


A complete binary tree is another specific type of binary tree where all the tree levels are
filled entirely with nodes, except the lowest level of the tree. Also, in the last or the lowest
level of this binary tree, every node should possibly reside on the left side. Here is the
structure of a complete binary tree:

3. Perfect Binary Tree


A binary tree is said to be ‘perfect’ if all the internal nodes have strictly two children, and
every external or leaf node is at the same level or same depth within a tree. A
perfect binary tree having height ‘h’ has 2h – 1 node. Here is the structure of a perfect
binary tree:

4. Balanced Binary Tree


A binary tree is said to be ‘balanced’ if the tree height is O(logN), where ‘N’ is the number of
nodes. In a balanced binary tree, the height of the left and the right subtrees of each node
should vary by at most one. An AVL Tree and a Red-Black Tree are some common examples of
data structure that can generate a balanced binary search tree. Here is an example of a
balanced binary tree:

A Binary Search Tree (BST) is a special type of binary tree in which the left child of a node has a value less than the node’s value
and the right child has a value greater than the node’s value. This property is called the BST property and it makes it possible to
efficiently search, insert, and delete elements in the tree.
The root of a BST is the node that has the smallest value in the left subtree and the largest value in
the right subtree. Each left subtree is a BST with nodes that have smaller values than the root and
each right subtree is a BST with nodes that have larger values than the root.
Binary Search Tree is a node-based binary tree data structure that has the following properties:
 The left subtree of a node contains only nodes with keys lesser than the node’s key.
 The right subtree of a node contains only nodes with keys greater than the node’s key.
 This means everything to the left of the root is less than the value of the root and
everything to the right of the root is greater than the value of the root. Due to this performing, a binary search is very
easy.
 The left and right subtree each must also be a binary search tree.
There must be no duplicate nodes(BST may have duplicate values with different handling approaches)

Various operations that can be performed on a BST:


 Insert a node into a BST : A new key is always inserted at the leaf. Start searching a key from the root till a leaf
node. Once a leaf node is found, the new node is added as a child of the leaf node.
Time Complexity: O(N), where N is the number of nodes of the BST
 Preorder traversal: Preorder traversal first visits the root node and then traverses the left and the right subtree. It is
used to create a copy of the tree. Preorder traversal is also used to get prefix expression on of an expression tree. Time
Complexity: O(N), where N is the number of nodes of the BST.
 Postorder traversal: Postorder traversal first traverses the left and the right subtree and then visits the root node. It is
used to delete the tree. In simple words, visit the root of every subtree last.
Time Complexity: O(N), where N is the number of nodes of the BST
 Level order traversal: Level order traversal of a BST is breadth first traversal for the tree. It visits all nodes at a
particular level first before moving to the next level. Time Complexity: O(N), where N is the number of nodes of the
BST

Applications of BST:
 Graph algorithms: BSTs can be used to implement graph algorithms, such as in minimum spanning tree algorithms.
 Priority Queues: BSTs can be used to implement priority queues, where the element with the highest priority is at the
root of the tree, and elements with lower priority are stored in the subtrees.
 Self-balancing binary search tree: BSTs can be used as a self-balancing data structures such as AVL tree and Red-
black tree.
 Data storage and retrieval: BSTs can be used to store and retrieve data quickly, such as in databases, where
searching for a specific record can be done in logarithmic time.

Advantages:
 Fast search: Searching for a specific value in a BST has an average time complexity of O(log n), where n is the
number of nodes in the tree. This is much faster than searching for an element in an array or linked list, which have a
time complexity of O(n) in the worst case.
 In-order traversal: BSTs can be traversed in-order, which visits the left subtree, the root, and the right subtree. This
can be used to sort a dataset.
 Space efficient: BSTs are space efficient as they do not store any redundant information, unlike arrays and linked
lists.
Disadvantages:
 Skewed trees: If a tree becomes skewed, the time complexity of search, insertion, and deletion operations will be
O(n) instead of O(log n), which can make the tree inefficient.
 Additional time required: Self-balancing trees require additional time to maintain balance during insertion and
deletion operations.
 Efficiency: BSTs are not efficient for datasets with many duplicates as they will waste space.

What is recursion?
In simple words, recursion is a problem solving, and in some cases, a programming technique that has a very special and
exclusive property. In recursion, a function or method has the ability to call itself to solve the problem. The process of recursion
involves solving a problem by turning it into smaller varieties of itself.

Recursion in stack in data structure is when functions call themselves directly or indirectly.
The process in which a function calls itself could happen directly as well as indirectly. This difference in call gives rise to
different types of recursion, which we will talk about a little later. Some of the problems that can be solved using recursion
include DFS of Graph, Towers of Hanoi, Different Types of Tree Traversals, and others.
Every step repeats itself at a smaller scale in recursion. Hence, all of them are combined to solve the problem. Recursion in the
data structure is one of the most compact and improved strategies for creating function calls or deploying functions in
programming. It helps you to implement many functions and algorithms efficiently, at the same time extending clarity while
executing a recursive algorithm in a code which data structure is used.

The biggest disadvantages of recursion are that there are greater space requirements for recursive functions as compared to an
iterative program because every function call has to remain in the stack until the base condition is met, and there are greater time
requirements, too, because the stack grows after each function call, and the final answer can only be returned after popping all the
elements from the stack.

The idea of threaded binary trees is to make inorder traversal faster and do it without stack and without recursion. A binary tree
is made threaded by making all right child pointers that would normally be NULL point to the inorder successor of the node (if it
exists).
A threaded binary tree is a type of binary tree data structure where the empty left and right child pointers in a binary tree are
replaced with threads that link nodes directly to their in-order predecessor or successor, thereby providing a way to traverse the tree
without using recursion or a stack.
Threaded binary trees can be useful when space is a concern, as they can eliminate the need for a stack during traversal. However,
they can be more complex to implement than standard binary trees.
There are two types of threaded binary trees.
Single Threaded: Where a NULL right pointers is made to point to
the inorder successor (if successor exists)
Double Threaded: Where both left and right NULL pointers are made to point to inorder predecessor and inorder successor
respectively. The predecessor threads are useful for reverse inorder traversal and postorder traversal.
The threads are also useful for fast accessing ancestors of a node.
Following diagram shows an example Single Threaded Binary Tree. The dotted lines represent threads.
0 seconds of 21 secondsVolume 0%

Advantages of Threaded Binary Tree


 In this Tree it enables linear traversal of elements.
 It eliminates the use of stack as it perform linear traversal, so save memory.
 Enables to find parent node without explicit use of parent pointer
 Threaded tree give forward and backward traversal of nodes by in-order fashion
 Nodes contain pointers to in-order predecessor and successor
 For a given node, we can easily find inorder predecessor and successor. So, searching is much more easier.
 In threaded binary tree there is no NULL pointer present. Hence memory wastage in occupying NULL links is
avoided.
 The threads are pointing to successor and predecessor nodes. This makes us to obtain predecessor and successor node
of any node quickly.
 There is no need of stack while traversing the tree, because using thread links we can reach to previously visited
nodes.
Disadvantages of Threaded Binary Tree
 Every node in threaded binary tree need extra information(extra memory) to indicate whether its left or right node
indicated its child nodes or its inorder predecessor or successor. So, the node consumes extra memory to implement.
 Insertion and deletion are way more complex and time consuming than the normal one since both threads and
ordinary links need to be maintained.
 Implementing threads for every possible node is complicated.
 Increased complexity: Implementing a threaded binary tree requires more complex algorithms and data structures than
a regular binary tree. This can make the code harder to read and debug.
 Extra memory usage: In some cases, the additional pointers used to thread the tree can use up more memory than a
regular binary tree. This is especially true if the tree is not fully balanced, as threading a skewed tree can result in a
large number of additional pointers.
 Limited flexibility: Threaded binary trees are specialized data structures that are optimized for specific types of
traversal. While they can be more efficient than regular binary trees for these types of operations, they may not be as
useful in other scenarios. For example, they cannot be easily modified (e.g. inserting or deleting nodes) without
breaking the threading.
 Difficulty in parallelizing: It can be challenging to parallelize operations on a threaded binary tree, as the threading
can introduce data dependencies that make it difficult to process nodes independently. This can limit the performance
gains that can be achieved through parallelism.

Applications of threaded binary tree –

 Expression evaluation: Threaded binary trees can be used to evaluate arithmetic expressions in a way that avoids
recursion or a stack. The tree can be constructed from the input expression, and then traversed in-order or pre-order to
perform the evaluation.
 Database indexing: In a database, threaded binary trees can be used to index data based on a specific field (e.g. last
name). The tree can be constructed with the indexed values as keys, and then traversed in-order to retrieve the data in
sorted order.
 Symbol table management: In a compiler or interpreter, threaded binary trees can be used to store and manage symbol
tables for variables and functions. The tree can be constructed with the symbols as keys, and then traversed in-order or
pre-order to perform various operations on the symbol table.
 Disk-based data structures: Threaded binary trees can be used in disk-based data structures (e.g. B-trees) to improve
performance. By threading the tree, it can be traversed in a way that minimizes disk seeks and improves locality of
reference.
 Navigation of hierarchical data: In certain applications, threaded binary trees can be used to navigate hierarchical data
structures, such as file systems or web site directories. The tree can be constructed from the hierarchical data, and then
traversed in-order or pre-order to efficiently access the data in a specific order.
AVL Tree is invented by GM Adelson - Velsky and EM Landis in 1962. The
tree is named AVL in honour of its inventors.AVL Tree can be defined as
height balanced binary search tree in which each node is associated with a
balance factor which is calculated by subtracting the height of its right sub-tree
from that of its left sub-tree.

Tree is said to be balanced if balance factor of each node is in between -1 to 1,


otherwise, the tree will be unbalanced and need to be balanced.AVL tree
controls the height of the binary search tree by not letting it to be skewed. The
time taken for all operations in a binary search tree of height h is O(h). However, it can be extended to O(n) if the BST becomes
skewed (i.e. worst case). By limiting this height to log n, AVL tree imposes an upper bound on each operation to be O(log n) where n
is the number of nodes.

Balance Factor (k) = height (left(k)) - height (right(k))

If balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.

If balance factor of any node is 0, it means that the left sub-tree and right sub-tree contain equal height.

If balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree.

An AVL tree is given in the following figure. We can see that, balance factor associated with each node is in between -1 and +1.
therefore, it is an example of AVL tree.

Complexity

Algorithm Average case Worst case

Space o(n) o(n)

Search o(log n) o(log n)

Insert o(log n) o(log n)

Delete o(log n) o(log n)

Operations on AVL tree

Due to the fact that, AVL tree is also a binary search tree therefore, all the operations are performed in the same way as they are
performed in a binary search tree. Searching and traversing do not lead to the violation in property of AVL tree. However, insertion
and deletion are the operations which can violate this property and therefore, they need to be revisited.

S Operation Description
N

1 Insertion Insertion in AVL tree is performed in the same way as it is performed in a binary search tree. However, it may
lead to violation in the AVL tree property and therefore the tree may need balancing. The tree can be balanced
by applying rotations.

2 Deletion Deletion can also be performed in the same way as it is performed in a binary search tree. Deletion may also
disturb the balance of the tree therefore, various types of rotations are used to rebalance the tree.
B Tree VS B+ Tree

S B Tree B+ Tree
N

1 Search keys can not be repeatedly stored. Redundant search keys can be present.

2 Data can be stored in leaf nodes as well as internal nodes Data can only be stored on the leaf nodes.

3 Searching for some data is a slower process since data can be Searching is comparatively faster as data can only be found
found on internal nodes as well as on the leaf nodes. on the leaf nodes.

4 Deletion of internal nodes are so complicated and time Deletion will never be a complexed process since element
consuming. will always be deleted from the leaf nodes.

5 Leaf nodes can not be linked together. Leaf nodes are linked together to make the search
operations more efficient.

You might also like