Main Presentation

Application of B & B+ Tree in
Storage Allocation
Introduction(1)
• As we have already seen, a database consists of tables, views,
indexes, procedures, functions, etc.
• Tables and views are logical ways of viewing the data, but
the actual data is stored on physical storage.
• A database usually holds a very large amount of data, so it
resides on physical storage devices such as magnetic disks.
• On a physical storage device, the data cannot be stored
as is; it is converted to binary format.
• Each storage device has many data blocks, each of
which can store a certain amount of data.
• The data is mapped to these blocks in order to store it on
the device.
Overview of Secondary Storage: Magnetic Disk
Access Data in Magnetic Disk(2)
• A traditional HDD has rotating platters
that store data in tracks.
• When data needs to be read or
written, the actuator arm must first
move the head to the track that holds
the target sector. The time this takes is
the seek time.
• After that, the platter must rotate until
the target sector is under the head
(rotational latency).
• When we are dealing with a huge amount
of data, this can become a bottleneck,
since the disk has to keep repositioning
to specific sectors.
• Average seek time varies from about 4 ms
on high-end servers to about 9 ms on
common servers.
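As a rough illustration of the two delays described above, the sketch below estimates the average time for one random block access as seek time plus half a rotation. The 7200 RPM spindle speed and the function name are assumptions made for illustration, not figures from this presentation; the 4 ms and 9 ms seek times are the ones quoted above, and transfer time is ignored.

```python
def avg_random_access_ms(seek_ms: float, rpm: float) -> float:
    """Average seek time plus average rotational latency (half a rotation)."""
    rotation_ms = 60_000.0 / rpm        # time for one full rotation, in ms
    return seek_ms + rotation_ms / 2.0  # on average the sector is half a turn away


ASSUMED_RPM = 7200  # assumption, not taken from the slides

for seek in (4.0, 9.0):
    total = avg_random_access_ms(seek, ASSUMED_RPM)
    print(f"seek {seek} ms -> about {total:.2f} ms per random access")
```

Under these assumptions, every random access costs several milliseconds, which is why reducing the number of disk accesses matters more than reducing CPU work.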
How Data Is Stored in Memory
• In a linear data structure, searching for an element in a list may require visiting
every element.
• This is slow: O(n) time.
• So it is convenient to store data in non-linear data structures, in
which the data is organized hierarchically.
• A tree structure represents hierarchical relationships among its elements.
• It is very useful for information retrieval, and searching a balanced tree is fast (O(log n)).
Motivation
• We assume that everything in a search tree is kept within the main
memory (including the balanced trees like AVL, red-black trees, splay
trees, etc.).
• What if the data items contained in a search tree do not fit into the main
memory?
• Just think about searching the UIDAI database (for AADHAAR details).
• Let us assume there are only 8 bytes of data (say, the AADHAAR ID) per
citizen and we have to create a search tree.
• The population of India: 1,358,856,931.
• The search tree will require more than 20 GB of memory (including
pointers)!
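The 20 GB figure can be checked with simple arithmetic. The sketch below assumes a binary search tree node that holds one 8-byte key and two 8-byte child pointers; the pointer size and the two-pointer layout are assumptions, since the slides give only the key size and the population.

```python
POPULATION = 1_358_856_931  # from the slide
KEY_BYTES = 8               # from the slide
POINTER_BYTES = 8           # assumption: 64-bit pointers
POINTERS_PER_NODE = 2       # assumption: binary search tree (left, right)


def tree_bytes(n: int, key_bytes: int, pointer_bytes: int, pointers_per_node: int) -> int:
    """Total bytes for n nodes, each holding one key and its child pointers."""
    return n * (key_bytes + pointers_per_node * pointer_bytes)


keys_only_gb = POPULATION * KEY_BYTES / 1e9
total_gb = tree_bytes(POPULATION, KEY_BYTES, POINTER_BYTES, POINTERS_PER_NODE) / 1e9

print(f"keys only:     {keys_only_gb:.1f} GB")
print(f"with pointers: {total_gb:.1f} GB")
```

The keys alone take roughly 11 GB, and under these assumptions the pointers push the total past 30 GB, which is consistent with the "more than 20 GB" claim.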
Search Tree on disk
• In a balanced binary search tree stored on disk, most tree
operations (search, insert, delete, etc.) require O(log2 n) disk
accesses, where n is the number of data items in the search tree.
• The main challenge is to reduce the number of disk accesses.
• An m-ary search tree allows m-way branching.
• As branching increases, the depth decreases.
• A complete binary tree has a height of ⌈log2 n⌉.
• But a complete m-ary tree has a height of ⌈logm n⌉.
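To see how strongly branching reduces depth, the following sketch computes ⌈logm n⌉ for the population figure used earlier. The branching factors 16 and 256 and the helper name are illustrative assumptions; integer arithmetic is used to avoid floating-point rounding in the logarithm.

```python
def ceil_log(n: int, m: int) -> int:
    """Smallest h with m**h >= n, i.e. ceil(log_m n), using integer arithmetic."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= m
        h += 1
    return h


n = 1_358_856_931  # population figure from the Motivation slide

for m in (2, 16, 256):  # 16 and 256 are assumed branching factors
    print(f"m = {m:3d}: height about {ceil_log(n, m)}")
```

If each level costs one disk access, going from binary branching to 256-way branching cuts the accesses per search from about 31 to about 4.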
Cycles to access different types of storage
Storage Type     Access Type    Number of Cycles
CPU registers    Random         1
L1 cache         Random         2
L2 cache         Random         30
Main Memory      Random         2.5 x 10^2
Hard Disk        Random         3 x 10^7
Hard Disk        Streaming      5 x 10^3

Characteristics of B Tree
• A B-tree is a low-depth, self-balancing tree.
• The height of a B-tree is kept low by packing as many keys as
possible into each node.
• Generally, the node size of a B-tree is set equal to the disk
block size, so that reading one node costs one disk access.
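A minimal sketch of how the block size determines the branching factor: a node with m child pointers and m − 1 keys must fit in one block. The 4096-byte and 8192-byte block sizes, the 8-byte keys and pointers, and the function name are assumptions for illustration; real systems also reserve space for a node header, which is ignored here.

```python
def max_order(block_bytes: int, key_bytes: int, pointer_bytes: int) -> int:
    """Largest m such that m child pointers and m - 1 keys fit in one block.

    Solves m * pointer_bytes + (m - 1) * key_bytes <= block_bytes for m.
    """
    return (block_bytes + key_bytes) // (key_bytes + pointer_bytes)


for block in (4096, 8192):  # assumed disk block sizes
    m = max_order(block, key_bytes=8, pointer_bytes=8)
    print(f"block of {block} bytes -> order m = {m}")
```

With these assumed sizes, a single 4096-byte block already supports 256-way branching.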
What is a B-Tree
• Definition:
A B-tree of order m is an m-ary tree with the following properties:
• The data items are stored at the leaves.
• The non-leaf nodes store up to m − 1 keys to guide the search; key
i is the smallest key in subtree i + 1.
• The root is either a leaf or has between 2 and m children.
• All non-leaf nodes (except the root) have between ⌈m/2⌉ and m
children.
• All leaves are at the same depth and have between ⌈k/2⌉ and k data
items, for some k.
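The definition above suggests a simple search procedure: at each non-leaf node, use the keys to pick one subtree, and look for the item only once a leaf is reached. The sketch below is one possible in-memory rendering under that definition; the class names, the `search` function, and the sample tree are hypothetical and not taken from the slides.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    items: list  # sorted data items


@dataclass
class Internal:
    keys: list      # keys[i] is the smallest key in children[i + 1]
    children: list  # len(children) == len(keys) + 1


def search(node, target):
    """Return (found, nodes_visited) for target, starting at node."""
    visited = 1
    while isinstance(node, Internal):
        # Count the keys <= target to choose which subtree to descend into.
        node = node.children[bisect_right(node.keys, target)]
        visited += 1
    i = bisect_left(node.items, target)
    found = i < len(node.items) and node.items[i] == target
    return found, visited


# Hypothetical tree of order 3 with three leaves.
root = Internal(
    keys=[20, 40],
    children=[Leaf([5, 10, 15]), Leaf([20, 25, 30]), Leaf([40, 50, 60])],
)

print(search(root, 25))
print(search(root, 35))
```

On disk, each visited node would correspond to one block read, so the visit count is a stand-in for the number of disk accesses.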
Searching
Insert 56 into tree
Delete
Delete
B+ Tree
• B+-trees are an important variant of B-trees.
• The performance of a B-tree depends heavily on the height of the tree.
• The deeper a tree, the more page lookups (on secondary storage) we need
to reach a leaf.
• So what can we do to “flatten” B-trees?
B+ Tree
• If we can increase the branching (number of pointers) in inner nodes, then
the tree will become “flatter”.
• Instead of storing data in inner nodes, we only store search keys (take up
less space ⇒ more room for pointers).
• We also link all the leaf nodes, allowing a fast sequential search.
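The benefit of linking the leaves can be sketched as a range scan that walks the leaf chain instead of going back up the tree. The `Leaf` class, its `next` field, and the sample data are hypothetical; in a full B+ tree the starting leaf would be found by descending from the root, which is omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    items: list                    # sorted data items in this leaf
    next: Optional["Leaf"] = None  # link to the next leaf in key order


def range_scan(start_leaf, low, high):
    """Collect items with low <= item <= high by following leaf links."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for item in leaf.items:
            if item > high:
                return result  # items are sorted, so nothing later can match
            if item >= low:
                result.append(item)
        leaf = leaf.next
    return result


# Hypothetical chain of three leaves.
third = Leaf([40, 50, 60])
second = Leaf([20, 25, 30], third)
first = Leaf([5, 10, 15], second)

print(range_scan(first, 10, 45))
```

Each step along the chain reads the next leaf directly, so a range query touches only the leaves that hold matching items plus the initial descent.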
Schema of B+ Tree
B+ Tree
• Definition:
A B+-tree of order m is an m-ary tree with the following properties:
• The data items are stored at the leaves.
• The non-leaf nodes store up to m − 1 keys to guide the search; key i
is the smallest key in subtree i + 1.
• The root is either a leaf or has between 2 and m children.
• All leaves are at the same depth and have up to k data items, for some k.
Example
Advantages
• Since all records are stored only in the leaf nodes, which form a sorted,
sequentially linked list, searching becomes very easy.
• B+ trees support range retrieval and partial retrieval. Traversing
the linked leaves makes this easy and quick.
• As the number of records increases or decreases, the B+ tree structure
grows or shrinks. There is no restriction on B+ tree size, as there is in ISAM.
• Since it is a balanced tree structure, inserts, deletes, and updates do not
degrade search performance.
• Since all the data is stored in the leaf nodes, internal nodes have room for
more branching, which makes the tree shorter. This reduces disk I/O, so
the structure works well on secondary storage devices.
