
SAMYUKTHA SWAMINATHAN

M21AI635

B Tree
What is B-Tree?
A B-tree of order (degree) d is a d-way search tree such that:
1. All leaf nodes are at the same level.
2. Within a node, the keys in the subtree to the left of a key are smaller than that key, and
the keys in the subtree to its right are larger.
3. Every non-leaf node except the root has at most d and at least ceil(d/2) children. The
root has at least 2 children unless it is a leaf.
Why B-Tree?
1. The database is too big to fit in main memory.
2. Disk reads are slow, so the number of block accesses must be kept small.
3. B-trees are generally used to implement dynamic multilevel indexing.
General Properties:
1. B-Tree is a self-balancing search tree.
2. Data is stored at every level (In every node)
3. The best-case (minimum) height of a B-tree is obtained by putting the maximum possible
number of keys in every node.
4. The B-Tree node size is kept equal to the disk block size.
5. All leaves are at the same level.
6. It is the most widely used external memory data structure.
7. 2-3-4 trees are B-trees of order 4. They are isometric to red-black trees.
8. The value of d depends upon disk block size.
9. All keys of a node are sorted in increasing order.
10. The child between two keys k1 and k2 contains all keys in the range between k1 and
k2.
11. A B-tree grows and shrinks at the root, unlike a binary search tree, which grows and
shrinks downward (at the leaves).
12. Insertion of a new key into a B-tree always happens at a leaf node.
13. Front compression and rear compression are techniques used to reduce the space and
time requirements of a B-tree. Compression lets each node hold more keys, so fewer
nodes are needed.
14. The larger the order of the B-tree, the less frequently splits occur.
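
The properties above (sorted keys in a node, insertion at a leaf, splitting, growth at the
root) can be illustrated with a short Python sketch. It is only a minimal in-memory
illustration under assumptions of its own: the names Node, search, insert and inorder are
hypothetical, a node is split as soon as it holds d keys, and duplicates are ignored.
Real implementations work on disk blocks and may split differently.

```python
from bisect import bisect_left


class Node:
    """One B-tree node: sorted keys plus child pointers (no children for a leaf)."""

    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

    @property
    def is_leaf(self):
        return not self.children


def search(node, key):
    """Walk down from the root; one node is visited per level, so the cost is O(h)."""
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.is_leaf:
            return False
        node = node.children[i]  # child between keys[i-1] and keys[i]


def _insert(node, key, d):
    """Insert key below node. Return (median, right_node) if node had to split."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None  # assumption of this sketch: duplicates are ignored
    if node.is_leaf:
        node.keys.insert(i, key)  # new keys always enter at a leaf
    else:
        split = _insert(node.children[i], key, d)
        if split is not None:
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) <= d - 1:  # at most d - 1 keys per node
        return None
    mid = len(node.keys) // 2  # overflow: push the median key up
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key, d):
    """Insert key and return the (possibly new) root; the tree grows only here."""
    split = _insert(root, key, d)
    if split is None:
        return root
    median, right = split
    return Node([median], [root, right])


def inorder(node):
    """Return all keys in increasing order."""
    if node.is_leaf:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out.extend(inorder(child))
        if i < len(node.keys):
            out.append(node.keys[i])
    return out
```

A tree is built by starting from an empty Node() and reassigning the root on every call,
for example root = insert(root, 17, d=4). A new level appears only when the root itself
splits, which is why all leaves stay at the same depth.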

Complexities:

Algorithm   Average     Worst case
Space       O(n)        O(n)
Search      O(log n)    O(log n)
Insert      O(log n)    O(log n)
Delete      O(log n)    O(log n)

(B-tree and AVL tree have the same worst case time complexity for insertion and deletion)

Disk accesses: Search, Insert, Delete, Max, Min, etc. each take O(h), where h is the tree height.


Tree Properties:

Degree/Order                                   d
No. of nodes in the tree                       n
No. of keys in the tree                        k
Height of the tree                             h
No. of splits                                  s
No. of keys in a node                          No. of children - 1
No. of children of a node                      No. of keys + 1
Maximum no. of children for a root node        d
Maximum no. of children for an internal node   d
Minimum no. of children for a root node        2
Minimum no. of children for an internal node   ceil(d/2)
Maximum no. of keys for root node              Max children for root node - 1 = d - 1
Minimum no. of keys for root node              Min children for root node - 1 = 2 - 1 = 1
Maximum no. of keys for internal nodes         Max children for internal node - 1 = d - 1
Minimum no. of keys for internal nodes         Min children for internal node - 1
                                               = ceil(d/2) - 1
                                               = (d-1)/2 if d is odd
                                               = floor((d-1)/2) if d is even
Minimum height of the B-Tree                   ceil(log_t(k+1)) - 1
                                               where t = max no. of children = d
Maximum height of the B-Tree                   floor(log_t((k+1)/2))
                                               where t = min no. of children = ceil(d/2)
Maximum no. of keys in a B-Tree of height h    d^(h+1) - 1
Minimum no. of keys in a B-Tree of height h    2*ceil(d/2)^h - 1 (= 1 when h = 0)
No. of nodes in the tree given 's' splits      2s + 1
are done on insertion
The average probability of a split             1/min no. of keys for internal nodes
                                               = 1/(ceil(d/2) - 1)
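
The key-count and height formulas in the table can be checked numerically. The sketch
below is only an illustration: the function names are hypothetical, k stands for the
total number of keys, and the heights are found with integer loops rather than
floating-point logarithms to avoid rounding problems. It assumes d >= 3 so that
ceil(d/2) >= 2.

```python
def ceil_half(d):
    """ceil(d/2) using integer arithmetic."""
    return -(-d // 2)


def max_keys(d, h):
    """Most keys a B-tree of order d and height h can hold: d^(h+1) - 1."""
    return d ** (h + 1) - 1


def min_keys(d, h):
    """Fewest keys in a B-tree of order d and height h: 2*ceil(d/2)^h - 1."""
    return 2 * ceil_half(d) ** h - 1


def min_height(d, k):
    """Smallest h whose capacity covers k keys; equals ceil(log_d(k+1)) - 1."""
    h = 0
    while max_keys(d, h) < k:
        h += 1
    return h


def max_height(d, k):
    """Largest h whose minimum fill still fits in k keys;
    equals floor(log_t((k+1)/2)) with t = ceil(d/2)."""
    if d < 3 or k < 1:
        raise ValueError("this sketch assumes d >= 3 and k >= 1")
    h = 0
    while min_keys(d, h + 1) <= k:
        h += 1
    return h


def node_key_limits(d):
    """(min keys, max keys) for a non-root node of order d."""
    return ceil_half(d) - 1, d - 1
```

For example, with d = 5 and k = 100 keys, min_height gives 2 and max_height gives 3,
matching ceil(log_5(101)) - 1 and floor(log_3(50.5)).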

Advantages:
1. Most other self-balancing search trees (such as AVL and red-black trees) assume that
everything is in main memory. The main idea of using B-trees is to reduce the number of
disk accesses.
2. Because the height of a B-tree is low, the total number of disk accesses for most
operations is significantly smaller than for balanced binary search trees such as AVL
and red-black trees.
Disadvantages:
1. B-Tree is inefficient for batched problems like sorting/minimum reporting (priority
queue) / range searching / interval stabbing
2. B-Tree is inefficient because it works element wise and not block wise
3. B-Tree does not take advantage of the memory size d.

Applications:
1. Databases and file systems
2. To store blocks of data (secondary storage media)
3. Multilevel indexing

Visualization:
https://www.cs.usfca.edu/~galles/visualization/BTree.html

Tutorial:
Introduction - https://www.youtube.com/watch?v=94ErZ5K8XZg
Insertion (Order 3) - https://www.youtube.com/watch?v=Ay2AbTk_QEg
Insertion (Order 4) - https://www.youtube.com/watch?v=aNU9XYYCHu8
Insertion (Order 5) - https://www.youtube.com/watch?v=c7hXEFs69Jw
Insertion (Order 5 alphabets) - https://www.youtube.com/watch?v=asRbo5HWpxc
Deletion (Order 5) - https://www.youtube.com/watch?v=GKa_t7fF8o0

Example problems:
http://web.eecs.utk.edu/~bvanderz/teaching/cs140Sp18/BTrees/
https://courses.cs.washington.edu/courses/cse332/12wi/homework/hw4.pdf
https://nanopdf.com/download/b-tree-practice-problems_pdf

B+ Tree
Why B+ Tree?

• It is an extension of B-Tree.
• B+ trees are generally used to implement dynamic multilevel indexing.
• The drawback of the B-tree is that it stores data in all nodes. This greatly reduces
the number of entries that can be packed into a node, which increases the number of
levels in the tree and hence the search time for a record.
• The B+ tree eliminates this drawback by storing data pointers only at the leaf
nodes of the tree.
Properties:
1. The leaf nodes store all the key values along with their corresponding data
pointers to the disk file block, in order to access them.
2. The leaf nodes form the first level of the index, with the internal nodes forming
the other levels of a multilevel index.
3. Some of the key values of the leaf nodes also appear in the internal nodes, where
they serve only to guide the search for a record.
4. Left children are strictly less than the parent and right children are >= the parent.
(Or vice versa)
5. A B+ tree, unlike a B-tree, has two orders, ‘a’ and ‘b’: one for the internal nodes
and the other for the external (or leaf) nodes.
6. Order ‘a’ is for internal nodes; its properties are the same as those of degree ‘d’
of a B-tree.
7. Order ‘b’ is for leaf nodes.
8. Each leaf node has at least ceil(b/2) values.
9. All leaf nodes are at the same level.
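
The role of the internal keys as routing information and of the leaves as the only place
holding data pointers can be sketched in Python. This is a minimal illustration on a
hand-built tree, not a full B+ tree: the class and function names are hypothetical,
strings stand in for data pointers, and the convention "left child < key, right child >=
key" from property 4 is assumed. The leaf-to-leaf link (next) is the usual B+ tree leaf
chain.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    """Leaf node: keys with their data pointers, plus a link to the next leaf."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values  # stand-ins for data pointers to disk blocks
        self.next = None


class Internal:
    """Internal node: routing keys and child pointers only, no data."""

    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


def find_leaf(node, key):
    """Descend to the leaf that would hold key (keys equal to a router go right)."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node


def lookup(root, key):
    """Return the data pointer for key, or None; every search ends at a leaf."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None


def range_scan(root, lo, hi):
    """Collect (key, pointer) pairs with lo <= key <= hi by walking the leaf chain."""
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next
    return out


# Hand-built example: routing keys 10 and 20 also appear in the leaves.
leaf1 = Leaf([1, 4], ["r1", "r4"])
leaf2 = Leaf([10, 12], ["r10", "r12"])
leaf3 = Leaf([20, 25], ["r20", "r25"])
leaf1.next, leaf2.next = leaf2, leaf3
root = Internal([10, 20], [leaf1, leaf2, leaf3])
```

Here lookup(root, 12) descends once and reads leaf2, while range_scan(root, 4, 20)
descends once to leaf1 and then follows the next links instead of going back up the tree.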

B-Tree vs B+Tree:

B-Tree                                         B+ Tree
Stores data at every level (in every node).    Stores data only in the leaves (internal
                                               nodes contain only keys and pointers).

Search keys are not stored more than once.     Redundant search keys can be present.

Searching is slower, since data can be found   Searching is comparatively faster, as data
on internal nodes as well as on leaf nodes.    can only be found on the leaf nodes.

Deletion from internal nodes is complicated    Deletion is not as complicated, since the
and time consuming.                            element is always deleted from a leaf node.

Leaf nodes are not linked together.            Leaf nodes are linked together to make
                                               search operations more efficient.

Complexities:
Algorithm   Average     Worst case
Space       O(n)        O(n)
Search      O(log n)    O(log n)
Insert      O(log n)    O(log n)
Delete      O(log n)    O(log n)

Advantages:
1. Buffer allows us to accumulate elements into blocks
2. Using buffers of size Θ(d), we fully utilize the memory
3. Operations are batched so Buffer Tree does not need to report the results
immediately
4. Because of the large memory, the Buffer Tree can postpone disk writes for as long as
possible.
5. The Buffer Tree gets a chance to carry out some operations in memory before writing
to disk.
6. A B+ tree with ‘l’ levels can store more entries in its internal nodes than a
B-tree with the same ‘l’ levels.
7. Since data is stored only in the leaf nodes, the search time for any given key
improves significantly.
8. Having fewer levels, together with the pointers that link the leaf nodes into a
list, makes the B+ tree quick and efficient at accessing records on disk.
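
Why a B+ tree packs more entries into an internal node can be seen from a small
calculation. The sketch below is an illustration under assumed sizes that are not from
these notes: a 4096-byte block and 8 bytes each for a key, a child pointer and a data
pointer. The function names are hypothetical. A B-tree node of order d holds d child
pointers, d - 1 keys and d - 1 data pointers; a B+ tree internal node of order a holds
only a child pointers and a - 1 keys.

```python
def max_order_btree(block, key, child_ptr, data_ptr):
    """Largest d with d*child_ptr + (d-1)*(key + data_ptr) <= block."""
    return (block + key + data_ptr) // (key + child_ptr + data_ptr)


def max_order_bplus_internal(block, key, child_ptr):
    """Largest a with a*child_ptr + (a-1)*key <= block."""
    return (block + key) // (key + child_ptr)


# Assumed sizes in bytes (hypothetical, for illustration only).
BLOCK, KEY, CHILD_PTR, DATA_PTR = 4096, 8, 8, 8

d = max_order_btree(BLOCK, KEY, CHILD_PTR, DATA_PTR)
a = max_order_bplus_internal(BLOCK, KEY, CHILD_PTR)
print("B-tree order d =", d)
print("B+ tree internal order a =", a)
```

Under these assumed sizes the B+ tree internal node has the larger fan-out, so the same
number of levels can address more leaf entries.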

Applications:
1. Databases and file systems
2. To store blocks of data (secondary storage media)
3. Multilevel indexing

Visualization:
https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html

Tutorial: (Not covered in lectures)


Insertion - https://www.youtube.com/watch?v=DqcZLulVJ0M
Deletion - https://www.youtube.com/watch?v=pGOdeCpuwpI
