
Application of B & B+ Tree in Storage Allocation
Introduction(1)
• As we have already seen, a database consists of tables, views, indexes, procedures, functions, etc.
• Tables and views are logical forms of viewing the data, but the actual data is stored in physical memory.
• A database holds a very large amount of data, so it is kept on physical storage devices such as magnetic disks.
• On physical storage devices, the data cannot be stored as it is; it is converted to binary format.
• Each storage device has many data blocks, each capable of storing a certain amount of data.
• The data is mapped to these blocks in order to store it on the device.
Overview of a Secondary Storage: Magnetic Disk
Access Data in Magnetic Disk(2)
• A traditional HDD has rotating platters that store data in tracks.
• When data needs to be read or written, the actuator arm must move to the track that holds the target sector. The time this takes is the seek time.
• After that, the platter must rotate until the target sector is under the head (rotational latency).
• When we are dealing with a huge amount of data, this can become a bottleneck, since the disk has to keep moving to specific sectors.
• Average seek time varies from about 4 ms for high-end servers to about 9 ms for common servers.
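As a rough illustration of how seek time and rotational latency add up, here is a minimal sketch. The 7200 RPM spindle speed and the function names are assumptions, not values from the slides. It relies only on the standard rule that the average rotational latency is half a revolution, and it ignores transfer time and controller overhead.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one full revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def avg_access_time_ms(seek_ms, rpm):
    """Seek time plus average rotational latency (transfer time is ignored)."""
    return seek_ms + avg_rotational_latency_ms(rpm)


ASSUMED_RPM = 7200  # assumption: a common spindle speed, not given in the slides

for seek_ms in (4.0, 9.0):  # seek times quoted above
    total = avg_access_time_ms(seek_ms, ASSUMED_RPM)
    print(f"seek {seek_ms} ms -> about {total:.2f} ms per random access")
```

Under these assumptions, a single random access costs several milliseconds, which is why the number of disk accesses matters so much in the rest of the discussion.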
How Data Is Stored in Memory
• In a linear data structure, searching for an element may require visiting every element in the list.
• This is slow: the search takes O(n) time.
• It is therefore convenient to store data in non-linear data structures, in which the data is organized hierarchically.
• A tree structure represents a hierarchical relationship among its elements.
• It is very useful for information retrieval, and searching in it is very fast.
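To make the gap between linear and hierarchical search concrete, the sketch below counts the elements examined in each case. Binary search over a sorted list is used here as a stand-in for descending a balanced search tree; that substitution and the function names are assumptions made for illustration.

```python
def linear_steps(data, target):
    """Count elements examined by a left-to-right scan."""
    for steps, value in enumerate(data, start=1):
        if value == target:
            return steps
    return len(data)


def binary_steps(data, target):
    """Count elements examined by binary search on a sorted list."""
    lo, hi, steps = 0, len(data) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if data[mid] == target:
            return steps
        if data[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


data = list(range(1_000_000))  # sorted sample data
target = data[-1]              # worst case for the linear scan
print("linear scan:", linear_steps(data, target), "elements examined")
print("binary search:", binary_steps(data, target), "elements examined")
```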
Motivation
• So far we have assumed that everything in a search tree is kept in main memory (including balanced trees such as AVL trees, red-black trees, splay trees, etc.).
• What if the data items in a search tree do not fit into main memory?
• Consider searching the UIDAI database (for AADHAAR details).
• Assume there are only 8 bytes of data (say, the AADHAAR ID) per citizen, and we have to build a search tree over them.
• The population of India: 1,358,856,931.
• The search tree will require more than 20 GB of memory (including pointers)!
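The memory estimate above can be checked with a short calculation. This is a sketch under stated assumptions: each node holds one 8-byte key plus two 8-byte child pointers (a binary search tree on a 64-bit machine). The pointer size and pointer count are assumptions; the slides only say "including pointers".

```python
POPULATION = 1_358_856_931   # from the slide
KEY_BYTES = 8                # from the slide: 8 bytes of data per citizen
POINTER_BYTES = 8            # assumption: 64-bit pointers
POINTERS_PER_NODE = 2        # assumption: left and right child of a binary tree


def tree_size_gb(n, key_bytes, pointer_bytes, pointers_per_node):
    """Total size in GB (10^9 bytes) of n nodes, each with one key and its pointers."""
    node_bytes = key_bytes + pointers_per_node * pointer_bytes
    return n * node_bytes / 1e9


keys_only_gb = POPULATION * KEY_BYTES / 1e9
with_pointers_gb = tree_size_gb(POPULATION, KEY_BYTES, POINTER_BYTES, POINTERS_PER_NODE)

print(f"keys only:     {keys_only_gb:.2f} GB")
print(f"with pointers: {with_pointers_gb:.2f} GB")
```

With these assumed sizes, the keys alone take roughly 11 GB and the full tree roughly 33 GB, consistent with the "more than 20 GB" figure.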
Search Tree on disk
• If a balanced binary search tree is stored on disk, most tree operations (search, insert, delete, etc.) require O(log2 n) disk accesses, where n is the number of data items in the search tree.
• The main challenge is to reduce the number of disk accesses.
• An m-ary search tree allows m-way branching.
• As branching increases, the depth decreases.
• A complete binary tree has a height of about ⌈log2 n⌉.
• A complete m-ary tree has a height of only about ⌈logm n⌉.
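The effect of branching on depth can be sketched numerically. The function below computes the smallest h with m^h ≥ n, which equals ⌈logm n⌉, using integer arithmetic to avoid floating-point rounding. The branching factor 256 and the function name are illustrative assumptions; n is the population figure quoted earlier.

```python
def levels_needed(n, m):
    """Smallest h such that m**h >= n, i.e. ceil(log_m(n)), computed with integers."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= m
        h += 1
    return h


n = 1_358_856_931  # number of data items (population figure from the slides)

for m in (2, 256):  # 256 is an assumed, illustrative branching factor
    print(f"m = {m:3d}: about {levels_needed(n, m)} levels")
```

Going from binary to 256-way branching cuts the number of levels from 31 to 4 for this n, and each level saved is one disk access saved.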
Cycles to access different types of storage
Storage Type     Access Type    Number of Cycles
CPU registers    Random         1
L1 cache         Random         2
L2 cache         Random         30
Main memory      Random         2.5 × 10^2
Hard disk        Random         3 × 10^7
Hard disk        Streaming      5 × 10^3
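To put the cycle counts in perspective, here is a minimal sketch that compares the cost of a search whose node visits each hit main memory with one whose node visits each require a random disk access. The cycle figures are taken from the table; the function name and the sample visit counts are assumptions for illustration.

```python
MAIN_MEMORY_CYCLES = 2.5e2   # random access to main memory, from the table
DISK_RANDOM_CYCLES = 3e7     # random access to hard disk, from the table


def search_cost_cycles(node_visits, cycles_per_visit):
    """Total cycles if every node visit costs one access of the given kind."""
    return node_visits * cycles_per_visit


slowdown = DISK_RANDOM_CYCLES / MAIN_MEMORY_CYCLES
print(f"one random disk access costs about {slowdown:,.0f} main-memory accesses")

for visits in (31, 4):  # assumed visit counts for a deep and a shallow tree
    cost = search_cost_cycles(visits, DISK_RANDOM_CYCLES)
    print(f"{visits:2d} disk accesses -> about {cost:.1e} cycles")
```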


Characteristics of B Tree
• A B-Tree is a low-depth, self-balancing tree.
• The height of a B-Tree is kept low by packing as many keys as possible into each node.
• Generally, the node size of a B-Tree is kept equal to the disk block size.
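If a node must fit in one disk block, the block size bounds the branching factor. The sketch below assumes a non-leaf node that stores m child pointers and m − 1 keys and nothing else. The 4096-byte block, 8-byte key, and 8-byte pointer sizes are assumptions, and real systems also reserve space for headers.

```python
def max_branching(block_bytes, key_bytes, pointer_bytes):
    """Largest m with m pointers and (m - 1) keys fitting in one block.

    Solves m * pointer_bytes + (m - 1) * key_bytes <= block_bytes for m.
    """
    return (block_bytes + key_bytes) // (key_bytes + pointer_bytes)


BLOCK_BYTES = 4096    # assumption: a common disk block / page size
KEY_BYTES = 8         # assumption: 8-byte keys
POINTER_BYTES = 8     # assumption: 8-byte child pointers

m = max_branching(BLOCK_BYTES, KEY_BYTES, POINTER_BYTES)
used = m * POINTER_BYTES + (m - 1) * KEY_BYTES
print(f"branching factor m = {m}, using {used} of {BLOCK_BYTES} bytes")
```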
What is B Tree
Definition:
A B-Tree of order m is an m-ary tree with the following properties:
• The data items are stored at the leaves.
• The non-leaf nodes store up to m − 1 keys to guide the search; key i represents the smallest key in subtree i + 1.
• The root is either a leaf or has between 2 and m children.
• All non-leaf nodes (except the root) have between ⌈m/2⌉ and m children.
• All leaves are at the same depth and have between ⌈k/2⌉ and k data items, for some k.
Searching
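A minimal sketch of searching under the definition above, where key i of a non-leaf node is the smallest key in subtree i + 1. The class names, the `search` function, and the small sample tree are hypothetical and only illustrate the descent; they are not taken from the slides.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, items):
        self.items = sorted(items)      # data items live only in leaves


class Internal:
    def __init__(self, keys, children):
        assert len(children) == len(keys) + 1
        self.keys = keys                # keys[i] = smallest key in children[i + 1]
        self.children = children


def search(node, target):
    """Return (found, nodes_visited) for target, descending from node."""
    visited = 1
    while isinstance(node, Internal):
        # The number of keys <= target is the index of the child to follow.
        node = node.children[bisect_right(node.keys, target)]
        visited += 1
    return target in node.items, visited


# Hypothetical sample tree with one non-leaf level.
root = Internal([20, 40], [Leaf([5, 10, 15]), Leaf([20, 25, 30]), Leaf([40, 50, 56])])
print(search(root, 25))
print(search(root, 35))
```

Each node visited corresponds to one block read when nodes are sized to disk blocks, so the visit count is the quantity the tree is designed to keep small.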
Insert 56 into tree
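A minimal sketch of the leaf-level step of an insertion, using the leaf capacity k from the definition. The leaf contents, k = 3, and the function name are hypothetical. The sketch covers only the leaf: propagating the separator key to the parent (and splitting the parent if it overflows) is omitted.

```python
from bisect import insort


def insert_into_leaf(items, key, k):
    """Insert key into a sorted leaf holding at most k items.

    Returns (left, right, separator). When the leaf does not overflow,
    right and separator are None. On overflow the leaf is split in two
    and separator is the smallest key of the new right leaf, which is
    the key the parent would store.
    """
    items = list(items)          # work on a copy of the leaf contents
    insort(items, key)           # keep the leaf sorted
    if len(items) <= k:
        return items, None, None
    mid = (len(items) + 1) // 2  # left leaf keeps the larger half
    left, right = items[:mid], items[mid:]
    return left, right, right[0]


# Hypothetical full leaf with k = 3; inserting 56 forces a split.
print(insert_into_leaf([40, 50, 60], 56, k=3))
# A leaf with spare room simply absorbs the key.
print(insert_into_leaf([40, 50], 56, k=3))
```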
Delete
B+ Tree
• B+-trees are an important variant of B-trees.
• The performance of a B-tree depends heavily on the height of the tree.
• The deeper a tree, the more page lookups (on secondary storage) we need
to reach a leaf.
• So what can we do to “flatten” B-trees?
• If we can increase the branching (number of pointers) in inner nodes, the tree becomes "flatter".
• Instead of storing data in inner nodes, we store only search keys, which take up less space and leave more room for pointers.
• We also link all the leaf nodes, allowing a fast sequential search.
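The benefit of linking the leaves can be sketched as a range scan that follows the leaf-to-leaf links instead of returning to the inner nodes. The `Leaf` class, the `link` and `range_scan` helpers, and the sample keys are hypothetical. In a full B+ tree, the starting leaf would be found by descending from the root; here the scan simply starts at the first leaf.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys     # sorted keys stored in this leaf
        self.next = None     # link to the next leaf in key order


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(start_leaf, lo, hi):
    """Collect all keys in [lo, hi] by walking the linked leaves."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return result    # keys are sorted, so nothing further can match
            if key >= lo:
                result.append(key)
        leaf = leaf.next
    return result


head = link([Leaf([5, 10, 15]), Leaf([20, 25, 30]), Leaf([40, 50, 56])])
print(range_scan(head, 12, 40))
```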
Schema of B+ Tree
B+ Tree
Definition:
A B+-Tree of order m is an m-ary tree with the following properties:
• The data items are stored at the leaves.
• The non-leaf nodes store up to m − 1 keys to guide the search; key i represents the smallest key in subtree i + 1.
• The root is either a leaf or has between 2 and m children.
• All leaves are at the same depth and have up to k data items, for some k.
Example
Advantages
• Since all records are stored only in the leaf nodes, which form a sorted, sequentially linked list, searching becomes very easy.
• With a B+ tree, range retrieval and partial retrieval are straightforward: traversing the tree structure and its linked leaves makes them easier and quicker.
• As the number of records increases or decreases, the B+ tree structure grows or shrinks. There is no restriction on B+ tree size, unlike in ISAM.
• Since it is a balanced tree structure, inserts, deletes, and updates do not degrade its performance.
• Since all the data is stored in the leaf nodes, the internal nodes have more branching, which makes the tree shorter. This reduces disk I/O, so the B+ tree works well on secondary storage devices.
