Module III
Cosequential Processing and the Sorting of Large Files

Mr. Balaji N
Department of Information Science and Engineering
Sahyadri College of Engineering and Management

April 10, 2019

Outline

An Object Oriented Model for Implementing Cosequential Files
Matching Names in Two Lists
Merging Two Lists
Application of the Model to a General Ledger Problem
Extension of the Model to Include Multiway Merging

Multilevel Indexing and B-Trees
Introduction: The Invention of B-Tree
AVL Trees
Paged Binary Tree
Multilevel Indexing: B-Tree Indexes
An Object Oriented Representation of B-Tree
B-Tree Methods: Search, Insert and Others
B-Tree Nomenclature
Formal Definition of B-Tree Properties
Worst-Case Search Depth
Deletion, Merging and Redistribution
Redistribution During Insertion: A Way to Improve

Matching Names in Two Lists

- Finding the names common to two lists is a match operation, or an intersection.
- Assume that duplicate names are not allowed within a list and that both lists are sorted in ascending order.
- Start by reading the initial item from each list; we find that they match.
- Output this first item as a member of the match set (the intersection).
- We then read the next item from each list. This time the item in List 2 is less than the item in List 1.
- Example: to match the item CARTER from List 1, scan down List 2 until we either find it or jump beyond it, then continue the process.
- Eventually we come to the end of one of the lists.

Matching Names in Two Lists (Contd.)

- Initializing: we need to arrange things in such a way that the procedure gets going properly.
- Getting and accessing the next list item: we need simple methods that support getting the next list element and accessing it.
- Synchronizing: we have to make sure that the current item from one list is never so far ahead of the current item on the other list that a match will be missed.
- Handling end-of-file conditions: when we get to the end of either List 1 or List 2, we need to halt the program.
- Recognizing errors: when an error occurs in the data, we want to detect it and take some action.

Matching Names in Two Lists (Contd.)

- At each step in the processing of the two lists, we can assume that we have two items to compare: a current item from List 1 and a current item from List 2.
- Compare the two items to determine whether Item(1) is less than, equal to, or greater than Item(2):
  - If Item(1) < Item(2), we get the next item from List 1;
  - If Item(1) > Item(2), we get the next item from List 2; and
  - If the items are the same, we output the item and get the next items from the two lists.

int Match (char * List1Name, char * List2Name, char * OutputListName)
{
    int MoreItems; // true while items remain in both lists
    InitializeList (1, List1Name); // initialize List 1
    InitializeList (2, List2Name); // initialize List 2
    InitializeOutput (OutputListName);
    // get the first item from each list
    MoreItems = NextItemInList(1) && NextItemInList(2);
    while (MoreItems)
    {
        if (Item(1) < Item(2))
            MoreItems = NextItemInList(1);
        else if (Item(1) == Item(2))
        {
            ProcessItem(1); // match found
            MoreItems = NextItemInList(1) && NextItemInList(2);
        }
        else // Item(1) > Item(2)
            MoreItems = NextItemInList(2);
    }
    FinishUp();
    return 1;
}

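The Match routine above depends on helpers (InitializeList, NextItemInList, Item, ProcessItem, FinishUp) that the slides do not show. As a minimal self-contained sketch of the same three-way comparison, the version below works on in-memory vectors instead of files. The name MatchLists and the use of std::vector are assumptions made for illustration, not part of the original model.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the names common to both lists (their intersection).
// Assumes each list is sorted in ascending order and has no duplicates.
std::vector<std::string> MatchLists(const std::vector<std::string>& list1,
                                    const std::vector<std::string>& list2) {
    std::vector<std::string> matches;
    std::size_t i = 0, j = 0;
    // Stop as soon as either list is exhausted: no further match is possible.
    while (i < list1.size() && j < list2.size()) {
        if (list1[i] < list2[j]) {
            ++i;                          // Item(1) < Item(2): advance List 1
        } else if (list2[j] < list1[i]) {
            ++j;                          // Item(1) > Item(2): advance List 2
        } else {
            matches.push_back(list1[i]);  // match found: output the item
            ++i;                          // then advance both lists
            ++j;
        }
    }
    return matches;
}
```

Calling MatchLists on {ADAMS, CARTER, DAVIS} and {ADAMS, BELL, CARTER, EVANS} yields ADAMS and CARTER; each list is read at most once from front to back.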
Merging Two Lists

- The three-way-test, single-loop model for cosequential processing can easily be modified to handle merging of lists.
- The difference between matching and merging is that with merging we must read completely through each of the lists.

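A minimal sketch of the merge variant, again on in-memory vectors. The name MergeLists is hypothetical, and the two trailing loops are a simplification chosen here to drain whichever list remains; they stand in for whatever end-of-file handling the file-based model uses. A name that occurs in both lists is written once, so the output is the union of the inputs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Merges two ascending, duplicate-free lists into one ascending list.
// A name present in both lists appears once in the output.
std::vector<std::string> MergeLists(const std::vector<std::string>& list1,
                                    const std::vector<std::string>& list2) {
    std::vector<std::string> merged;
    std::size_t i = 0, j = 0;
    while (i < list1.size() && j < list2.size()) {
        if (list1[i] < list2[j]) {
            merged.push_back(list1[i++]);   // smaller item comes from List 1
        } else if (list2[j] < list1[i]) {
            merged.push_back(list2[j++]);   // smaller item comes from List 2
        } else {
            merged.push_back(list1[i]);     // equal: output once,
            ++i;                            // advance both lists
            ++j;
        }
    }
    // Unlike matching, merging must read completely through both lists.
    while (i < list1.size()) merged.push_back(list1[i++]);
    while (j < list2.size()) merged.push_back(list2[j++]);
    return merged;
}
```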
Application of the Model to a General Ledger Problem

The Problem

- Design a general ledger posting program as part of an accounting system.
- The system includes a journal file and a ledger file.
- The ledger file contains month-by-month summaries of the values associated with each of the bookkeeping accounts.
- The sample considered here is a portion of the ledger containing only checking and expense accounts.
- The journal file contains the monthly transactions that are ultimately to be posted to the ledger file. Entries in the journal file are paired.
- Posting involves associating each transaction with its account in the ledger.
- Posting is implemented by using the account number as a key to relate the journal transactions to the ledger records.

Application of the Model to a General Ledger Problem

The Problem (Contd.)

- One solution is to build an index for the ledger so that we can work through the journal transactions, using the account number in each journal entry to look up the correct ledger record.
- This solution involves seeking back and forth across the ledger file as we work through the journal.
- A better solution is to begin by collecting all the journal transactions that relate to a given account.
- This means sorting the journal transactions by account number, producing a list ordered by account.
- Create the output list by using the ledger and the sorted journal cosequentially, that is, by processing the two lists sequentially and in parallel.

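The cosequential posting step can be sketched as follows. This is an illustration under stated assumptions rather than the posting program itself: the record layouts (LedgerRecord, JournalEntry), the use of integer cents, and the function name PostJournal are all hypothetical, and the sketch only updates balances instead of printing a ledger report. Both inputs are assumed to be sorted by account number.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record layouts; amounts are kept in integer cents.
struct LedgerRecord { int account; long balanceCents; };
struct JournalEntry { int account; long amountCents; };

// Posts journal entries to ledger records in one cosequential pass.
// Both inputs must be sorted in ascending order of account number.
// Accounts of journal entries with no ledger record go to `unmatched`.
std::vector<LedgerRecord> PostJournal(std::vector<LedgerRecord> ledger,
                                      const std::vector<JournalEntry>& journal,
                                      std::vector<int>& unmatched) {
    std::size_t l = 0, j = 0;
    while (l < ledger.size() && j < journal.size()) {
        if (ledger[l].account < journal[j].account) {
            ++l;  // no more transactions for this account
        } else if (ledger[l].account == journal[j].account) {
            ledger[l].balanceCents += journal[j].amountCents;
            ++j;  // stay on this ledger record: more entries may follow
        } else {
            unmatched.push_back(journal[j].account);  // unknown account
            ++j;
        }
    }
    // Journal entries left after the ledger ends are unmatched as well.
    for (; j < journal.size(); ++j) unmatched.push_back(journal[j].account);
    return ledger;
}
```

Note the asymmetry with simple matching: on a match only the journal advances, because one ledger account can have many transactions.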
Extension of the Model to Include Multiway Merging

- Synchronizing in a two-way merge of two lists of names is the process of deciding which of the two input items has the minimum value, outputting that item, then moving ahead in the list from which that item was taken.
- For a K-way merge, keep an array of lists and an array of the items (or keys) that are being used from each list in the cosequential process:
  list[0], list[1], list[2], ... list[k-1]
  item[0], item[1], item[2], ... item[k-1]
- MinIndex is a function that finds the index of the item with the minimum collating-sequence value; an inner loop then finds all lists that are using that item.
- Each step therefore consists of finding the minimum and testing to see in which lists the item occurs and which files therefore need to be read.

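A minimal sketch of this K-way merge on in-memory lists. MinIndex is the name used on the slide; its signature, the pos array of current positions (playing the role of item[k]), and the function KWayMerge are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Index of the list whose current item is smallest,
// or -1 when every list is exhausted.
int MinIndex(const std::vector<std::vector<std::string>>& lists,
             const std::vector<std::size_t>& pos) {
    int best = -1;
    for (std::size_t k = 0; k < lists.size(); ++k) {
        if (pos[k] >= lists[k].size()) continue;  // list k is exhausted
        if (best < 0 || lists[k][pos[k]] < lists[best][pos[best]])
            best = static_cast<int>(k);
    }
    return best;
}

// Merges k ascending lists; an item found in several lists is output once.
std::vector<std::string> KWayMerge(
        const std::vector<std::vector<std::string>>& lists) {
    std::vector<std::size_t> pos(lists.size(), 0);  // current item per list
    std::vector<std::string> merged;
    for (int m = MinIndex(lists, pos); m >= 0; m = MinIndex(lists, pos)) {
        const std::string item = lists[m][pos[m]];  // the minimum item
        merged.push_back(item);
        // Inner loop: advance every list that is currently using this item.
        for (std::size_t k = 0; k < lists.size(); ++k)
            if (pos[k] < lists[k].size() && lists[k][pos[k]] == item)
                ++pos[k];
    }
    return merged;
}
```

Each output item costs a scan over all k current items, which is acceptable for small k and motivates the selection tree for larger k.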
A Selection Tree for Merging Large Number of Lists

- As we begin merging a larger number of lists, the set of sequential comparisons needed to find the key with the minimum value becomes noticeably expensive.
- If there is a need to merge considerably more than eight lists, we could replace the loop of comparisons with a selection tree.
- A selection tree is a classic time-versus-space trade-off: it reduces the time required to find the key with the lowest value by using a data structure to save information about the relative key values across cycles of the procedure's main loop.
- It is a kind of tournament tree in which each higher-level node represents the winner of the comparison between its two descendant keys.

A Selection Tree for Merging Large Number of Lists (Contd.)

- The selection tree is a binary tree. The minimum value is always at the root node, and each key has an associated reference to the list from which it came.
- The number of comparisons required to establish a new tournament winner is related to the depth of the binary tree rather than to the number of lists.

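The sketch below does not build the tournament tree itself. It swaps in a binary min-heap (std::priority_queue), a different structure that gives the same benefit: the minimum key sits at the root, each key carries a reference to its source list, and replacing the winner costs a number of comparisons proportional to the depth of the tree. The function name HeapMerge is hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// K-way merge driven by a min-heap instead of a linear scan.
// An item that occurs in several lists is output once.
std::vector<std::string> HeapMerge(
        const std::vector<std::vector<std::string>>& lists) {
    // Each entry pairs a key with the index of the list it came from.
    using Entry = std::pair<std::string, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<std::size_t> pos(lists.size(), 0);

    // Seed the heap with the first item of every non-empty list.
    for (std::size_t k = 0; k < lists.size(); ++k)
        if (!lists[k].empty()) heap.push(Entry(lists[k][0], k));

    std::vector<std::string> merged;
    while (!heap.empty()) {
        Entry top = heap.top();  // the minimum key is always at the root
        heap.pop();
        // Output is ascending, so duplicates are adjacent and easy to skip.
        if (merged.empty() || merged.back() != top.first)
            merged.push_back(top.first);
        // Refill from the list that supplied the winner.
        std::size_t k = top.second;
        if (++pos[k] < lists[k].size())
            heap.push(Entry(lists[k][pos[k]], k));
    }
    return merged;
}
```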
Introduction: The Invention of B-Tree

- Douglas Comer's survey article, The Ubiquitous B-Tree (1979), recounts the background to the invention of the B-tree.
- The goal was the discovery of a general method for storing and retrieving data in large file systems that would provide rapid access to the data with minimal overhead cost.
- R. Bayer and E. McCreight published the article Organization and Maintenance of Large Ordered Indexes (1972), which introduced B-trees.

Statement of the Problem

1. Searching the index must be faster than binary searching:
   - Searching for a key on a disk often involves seeking to different disk tracks.
   - Seeks are expensive, so a search of a large file takes considerable time.
   - A binary search requires an average of about 9.5 seeks to find a key in an index of 1000 items.
2. Insertion and deletion must be as fast as search:
   - Inserting a key into an index involves moving a large number of the other keys in the index.
   - We need to find a way to make insertions and deletions that have only local effects in the index rather than requiring massive reorganization.

These were the two critical problems that confronted Bayer and McCreight in 1970.

Indexing with Binary Search Trees

- Consider the cost of keeping a list in sorted order so that we can perform binary searches.
- Sorted list of keys: AX CL DE FB FT HN JD KF NR PA RF SD TK WS YJ
- Constructing a binary search tree (BST) from these keys is straightforward.
- Using elementary data structure techniques, it is simple to create nodes that contain left and right link fields so the binary search tree can be constructed as a linked structure.
- A plain BST is not fast enough for disk-resident indexing, and it lacks an effective strategy for balancing the tree.
- Solution: AVL trees and paged binary trees.
- We no longer have to sort the file to perform a binary search: the records in the table of tree nodes appear in random rather than sorted order.

Indexing with Binary Search Trees (Contd.)

- The sequence of the records in the file has no relation to the structure of the tree; all the information about the logical structure is carried in the link fields.
- If we add a new key to the file, we need only link it to the appropriate leaf node to create a tree that provides search performance as good as we would get with a binary search on a sorted list.
- Search performance on this tree is still good because the tree is in a balanced state.
- Balanced: the height of the shortest path to a leaf does not differ from the height of the longest path by more than one level.

Indexing with Binary Search Trees (Contd.)

- Suppose we add the following keys to the tree in the sequence in which they appear: NP MB TM LA UF ND TS NK.
- Searching down through the tree and adding each key at its correct position in the search tree leaves the tree out of balance.
- This is a typical result for trees that are built by placing the keys into the tree as they occur, without rearrangement.
- A binary search on a sorted list of these 24 keys requires only 5 seeks in the worst case.
- If each node is treated as a fixed-length record in which the link fields contain relative record numbers (RRNs) pointing to other nodes, then it is possible to place such a tree structure on secondary storage.

Indexing with Binary Search Trees (Contd.)

- The table below illustrates the contents of the 15 records that would be required to form the binary tree.
- Note that more than half of the link fields in the file are empty because they belong to leaf nodes, which have no children.
- A link value of -1 marks an empty link field: the search through the tree has reached the leaf level and there are no more nodes on the search path.

Indexing with Binary Search Trees (Contd.)

RRN  Key  Left child  Right child
0    FB   10          8
1    JD
2    RF
3    SD   6           13
4    AX
5    YJ
6    PA   11          2
7    FT
8    HN   7           1
9    KF   0           3
10   CL   4           12
11   NR
12   DE
13   WS   14          5
14   TK

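To make the role of the link fields concrete, the sketch below stores the 15 records from the table in a vector indexed by RRN and searches by following links. The struct layout and the names BuildIndexFile and SearchIndex are hypothetical, blank link fields are encoded as -1, and the root RRN of 9 (key KF) is inferred from the table, since it is the only record that no other record links to.

```cpp
#include <string>
#include <vector>

const int kNoChild = -1;  // empty link field
const int kRootRrn = 9;   // KF: inferred root, no other record points to it

struct Node {
    std::string key;
    int left;   // RRN of the left child, or kNoChild
    int right;  // RRN of the right child, or kNoChild
};

// The 15 records of the table, stored in RRN order (not in key order).
std::vector<Node> BuildIndexFile() {
    return {
        {"FB", 10, 8},  {"JD", -1, -1}, {"RF", -1, -1}, {"SD", 6, 13},
        {"AX", -1, -1}, {"YJ", -1, -1}, {"PA", 11, 2},  {"FT", -1, -1},
        {"HN", 7, 1},   {"KF", 0, 3},   {"CL", 4, 12},  {"NR", -1, -1},
        {"DE", -1, -1}, {"WS", 14, 5},  {"TK", -1, -1}};
}

// Returns the RRN holding `key`, or kNoChild if the key is absent.
// `visited` counts the records read; on disk each could cost a seek.
int SearchIndex(const std::vector<Node>& file, int rootRrn,
                const std::string& key, int& visited) {
    visited = 0;
    int rrn = rootRrn;
    while (rrn != kNoChild) {
        ++visited;
        const Node& node = file[rrn];
        if (key == node.key) return rrn;
        rrn = (key < node.key) ? node.left : node.right;
    }
    return kNoChild;  // fell off the leaf level
}
```

Searching for NR from the root visits KF, SD, PA and NR in turn, that is, four records for a 15-key tree.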
AVL Trees

- To avoid a degenerate tree, reorganize the nodes of the tree as we receive new keys, maintaining a near-optimal tree structure.
- An elegant method for handling this reorganization results in a class of trees known as AVL trees.
- AVL trees are named in honor of the Russian mathematicians G. M. Adel’son-Vel’skii and E. M. Landis.
- An AVL tree is a height-balanced tree: there is a limit placed on the amount of difference allowed between the heights of any two sub-trees sharing a common root.
- In an AVL tree the maximum allowable difference is one.
- An AVL tree is therefore a height-balanced 1-tree, or HB(1) tree.
- HB(k) trees form the more general class of trees that are permitted to be k levels out of balance.

AVL Tree Construction

- The sub-trees of every node differ in height by at most one.
- Every sub-tree is an AVL tree.
- Balance requirement for an AVL tree: the left and right sub-trees differ by at most 1 in height.
- A tree in which each left sub-tree has a height 1 greater than the corresponding right sub-tree still satisfies this requirement.
- AVL trees rely on adding an extra attribute, the balance factor, to each node.
- This factor indicates whether the tree is left-heavy (the height of the left sub-tree is 1 greater than the right sub-tree), balanced (both sub-trees are the same height) or right-heavy (the height of the right sub-tree is 1 greater than the left sub-tree).


AVL Tree
Construction
• If the balance would be destroyed by an insertion, a rotation is performed to correct the balance.
• AVL trees remain balanced, and thus guarantee O(log n) search times, in a dynamic environment. More importantly, while any tree can be re-balanced at considerable cost, an AVL tree can be re-balanced in O(log n) time.

AVL Tree
Construction
• An AVL tree checks the heights of the left and right sub-trees and ensures that the difference is not more than 1. This difference is called the Balance Factor.
• In Figure 1, the first tree is balanced and the next two trees are not balanced.
BalanceFactor = height(left sub-tree) - height(right sub-tree)    (1)
If the difference in the heights of the left and right sub-trees is more than 1, the tree is rebalanced using rotation techniques.
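Equation (1) can be sketched in C++ as follows. This is a minimal illustration, not code from the course material: the AvlNode type and the function names are assumptions, and height is counted in edges (an empty sub-tree has height -1, a single node has height 0).

```cpp
#include <algorithm>

// Hypothetical node type for illustration; the slides do not define one.
struct AvlNode {
    int key = 0;
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
};

// Height counted in edges: empty sub-tree = -1, single node = 0.
int height(const AvlNode* n) {
    return n == nullptr ? -1 : 1 + std::max(height(n->left), height(n->right));
}

// Equation (1): height(left sub-tree) - height(right sub-tree).
int balanceFactor(const AvlNode* n) {
    return n == nullptr ? 0 : height(n->left) - height(n->right);
}

// A node satisfies the AVL condition when its balance factor is -1, 0 or +1.
bool isBalancedAt(const AvlNode* n) {
    int bf = balanceFactor(n);
    return bf >= -1 && bf <= 1;
}
```

For a chain of three nodes linked only through left children, the top node has balance factor 2, so it violates the AVL condition and would need a rotation.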

AVL Rotations
To balance itself, an AVL tree may perform the following four kinds of rotations:
1. Left Rotation,
2. Right Rotation,
3. Left-Right Rotation, and
4. Right-Left Rotation.
The first two rotations are single rotations and the next two are double rotations. To have an unbalanced tree, we need a tree of at least height 2.
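The four rotations listed above can be sketched as pointer manipulations. This is one common way to write them, offered as an illustration only: the AvlNode type and function names are assumptions, and the sketch leaves out the balance-factor bookkeeping that a full AVL implementation also needs.

```cpp
// Hypothetical node type for illustration; the slides do not define one.
struct AvlNode {
    int key = 0;
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
};

// Left rotation: the right child becomes the new root of this sub-tree.
// Precondition: x and x->right are non-null.
AvlNode* rotateLeft(AvlNode* x) {
    AvlNode* y = x->right;
    x->right = y->left;   // y's left sub-tree moves under x
    y->left = x;
    return y;
}

// Right rotation: the left child becomes the new root of this sub-tree.
// Precondition: y and y->left are non-null.
AvlNode* rotateRight(AvlNode* y) {
    AvlNode* x = y->left;
    y->left = x->right;   // x's right sub-tree moves under y
    x->right = y;
    return x;
}

// Left-Right rotation: rotate the left child left, then the node right.
AvlNode* rotateLeftRight(AvlNode* n) {
    n->left = rotateLeft(n->left);
    return rotateRight(n);
}

// Right-Left rotation: rotate the right child right, then the node left.
AvlNode* rotateRightLeft(AvlNode* n) {
    n->right = rotateRight(n->right);
    return rotateLeft(n);
}
```

Each function returns the new root of the rotated sub-tree, which the caller stores back into the parent's link. Only a handful of pointers change, which is why a rotation stays confined to a local area of the tree.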

AVL Tree
Features
1. By setting a maximum allowable difference in the height of any two sub-trees, AVL trees guarantee a minimum level of performance in searching; and
2. Maintaining a tree in AVL form as new nodes are inserted involves the use of one of a set of four possible rotations. Each rotation is confined to a single, local area of the tree.
• AVL trees are not themselves directly applicable to most file structure problems because, like all strictly binary trees, they have too many levels: they are too deep.
• An AVL tree guarantees that search performance approximates that of a completely balanced tree.
• For a completely balanced tree, the worst-case search to find a key, given N possible keys, looks at log2(N + 1) levels.
• For an AVL tree, the worst-case search looks at about 1.44 log2(N + 2) levels.
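To get a feel for the two worst-case figures, the small sketch below evaluates them for a given N. It is illustrative only: the function names are assumptions, and the AVL figure uses the commonly quoted approximation 1.44 log2(N + 2).

```cpp
#include <cmath>

// Worst-case levels searched in a completely balanced binary tree of n keys.
double balancedLevels(double n) {
    return std::log2(n + 1.0);
}

// Approximate worst-case levels for an AVL tree of n keys, using the
// commonly quoted bound 1.44 * log2(n + 2) (an assumption of this sketch).
double avlLevels(double n) {
    return 1.44 * std::log2(n + 2.0);
}
```

For one million keys the balanced tree needs about 20 levels and the AVL bound is under 29, so the AVL tree is at most roughly 44 percent deeper. Both are still far too deep when every level costs a disk seek.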

Paged Binary Tree
• Paging divides the binary tree into pages and then stores each page in a block of contiguous locations on disk.
• It reduces the number of seeks associated with any search.
• It has the potential to result in faster searching on secondary storage.
• A typical 8 KB page can hold 511 key/reference field pairs.
• Assume that each page contains a completely balanced full binary tree and that the pages themselves are organized as a completely balanced full tree.
• The number of seeks required for a worst-case search of a completely full, balanced binary tree is log2(N + 1), where N is the number of keys in the tree.
• For the paged version of a completely full, balanced tree it is log_{k+1}(N + 1), where k is the number of keys held in a single page.
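The two seek counts can be compared with a short sketch. It assumes a completely full tree and uses integer arithmetic to evaluate the ceiling of log_fanout(N + 1); the function name and the sample value of N are illustrative choices, not taken from the slides.

```cpp
#include <cstdint>

// Smallest number of levels such that a completely full tree with `fanout`
// descendants per node holds at least n keys, i.e. the ceiling of
// log_fanout(n + 1). Integer arithmetic avoids floating-point rounding.
int levelsNeeded(std::uint64_t n, std::uint64_t fanout) {
    int levels = 0;
    std::uint64_t capacity = 1;   // fanout raised to the power `levels`
    while (capacity < n + 1) {
        capacity *= fanout;
        ++levels;
    }
    return levels;
}
```

With k = 511 keys per page the fan-out is k + 1 = 512. For N = 134,217,727 keys (2^27 - 1), levelsNeeded(N, 2) gives 27 seeks for the unpaged binary tree, while levelsNeeded(N, 512) gives 3 seeks for the paged tree.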

Paged Binary Tree
Contd.
Figure: Paged Binary Tree

Paged Binary Tree
Problems
• Inefficient disk usage.
• Implementation and building of the paged tree.

B-Trees
• B-trees are multilevel indexes that solve the problem of the linear cost of insertion and deletion.
• Each node of a B-tree is an index record, and each of these records has the same maximum number of key-reference pairs, called the order of the B-tree.
• Insertion: update the index record. If the new key is the new largest key in the index record, it becomes the new higher-level key of that record, and the next higher level of the index must be updated.
• If an insertion causes an index record to become overfull, the record is split into two records, each with half of the keys.
• Promotion of the key: since a new index node has been created at this level, the largest key in this new node must be inserted into the next higher level node.

Creating a B-Tree
• We use an order-4 B-tree, meaning a maximum of 4 key-reference pairs per node.
• A small node size has the advantage of causing pages to split more frequently, which gives more examples of splitting and promotion.
• The sequence is: C S D T A M P I B W N G U R K E H O L J Y Q Z F X V
• The tree starts with a single empty record; the first 4 keys are inserted into this record.
• When the fifth key, A, is added, the original node is split and the tree grows by one level as a new root is created.
• The keys in the root are the largest key in the left leaf, D, and the largest key in the right leaf, T.
• The keys M, P and I belong in the rightmost leaf node, since they are larger than the largest key in the left node, D.
• The insertion of I causes a split, since the rightmost leaf node becomes overfull.

Creating a B-Tree
Contd.
• The largest key in the new node, P, is inserted into the root.
• The same process continues as the keys B, W, N, G and U are inserted.
• The next key in the list, R, should be put into the rightmost leaf node, since it is greater than the largest key in the previous node, P, and less than or equal to the largest key in the rightmost node, W.
• Insertion of R causes the rightmost leaf node to split; the resulting insertion into the root causes the root to split, and the tree grows to three levels.
• Insertion of K, E, H, O, L, J, Y, Q and Z continues with another node split.

Creating a B-Tree
Contd.
• The next key in the sequence, F, requires splitting the second leaf node.
• Insertion of X and V causes the rightmost leaf to become overfull and split.
• The rightmost node of the middle level is then also overfull and is split.
• All 26 letters are now inserted into a tree of height three and order four.
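The first part of this construction can be replayed with a deliberately simplified sketch. It models only a two-level tree: each leaf is a sorted string of single-character keys and the root is derived as the largest key of each leaf. The TwoLevelTree type and its members are assumptions made for illustration, and splitting of the root itself is not modelled, so the sketch is only meaningful up to the key U.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Simplified sketch: leaves are sorted strings of single-character keys.
// Only leaf splits are modelled; root overflow is left out.
struct TwoLevelTree {
    std::size_t order;                 // maximum number of keys per leaf
    std::vector<std::string> leaves;

    explicit TwoLevelTree(std::size_t maxKeys) : order(maxKeys), leaves(1) {}

    void insert(char key) {
        // Pick the first leaf whose largest key is >= key, else the last leaf.
        std::size_t i = 0;
        while (i + 1 < leaves.size() && leaves[i].back() < key) ++i;
        std::string& leaf = leaves[i];
        leaf.insert(std::upper_bound(leaf.begin(), leaf.end(), key), key);
        if (leaf.size() > order) {     // overflow: split the leaf in two
            std::size_t midpt = (leaf.size() + 1) / 2;
            std::string upper = leaf.substr(midpt);
            leaf.erase(midpt);
            leaves.insert(leaves.begin() + i + 1, upper);
        }
    }

    // The root holds the largest key of each leaf.
    std::string rootKeys() const {
        std::string keys;
        for (const std::string& leaf : leaves)
            if (!leaf.empty()) keys.push_back(leaf.back());
        return keys;
    }
};
```

Inserting C S D T A into a TwoLevelTree of order 4 leaves the root keys D and T. Adding M P I gives D, P, T, and adding B W N G U gives D, M, P, W with the rightmost leaf holding S T U W. That leaf is full, so the next key, R, forces the split described above.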

Representing B-Tree Nodes in Memory
• A class is needed to represent the memory-resident B-tree nodes.
• Class BTreeNode is a template class and has methods to insert and remove a key and to split and merge nodes.
• Protected members store the file address of the node and the minimum and maximum number of keys.
• The data members of a BTreeNode that must persist are written out with Pack and read back with Unpack when the object is not in memory.

Representing B-Tree Nodes in Memory
Contd.
template <class keyType>
class BTreeNode : public SimpleIndex<keyType>
// this is the in-memory version of the BTreeNode
{
public:
    BTreeNode(int maxKeys, int unique = 1);
    int Insert(const keyType key, int recAddr);
    int Remove(const keyType key, int recAddr = -1);
    keyType LargestKey();                    // returns value of largest key
    int Split(BTreeNode<keyType> *newNode);  // move into newNode
    int Pack(IOBuffer &buffer) const;
    int Unpack(IOBuffer &buffer);
protected:
    int MaxBKeys;                            // maximum number of keys in a node
    int Init();
    friend class BTree<keyType>;
};

Search
Characteristics
Search is a typical tree-searching procedure. The methods Search and FindLeaf share two characteristics:
• They are iterative, and
• They work in two stages, operating alternately on entire pages and then within pages.
• The search loads a page into memory and then searches through the page, looking for the key at successively lower levels of the tree until it reaches the leaf level.
• Search for L: recAddr = btree.Search('L');
• Method Search calls method FindLeaf, which searches down a branch of the tree, beginning at the root, which is referenced by the pointer value Nodes[0].
• In the first iteration, with level = 1, the line
  recAddr = Nodes[level-1]->Search(key, -1, 0);
  is an inexact search and finds that L is less than P, the first key in the record.

Search
Contd.
• The line Nodes[level] = Fetch(recAddr); reads that second-level node into a new BTreeNode object and makes Nodes[1] point to this new object.
• The second iteration, with level = 2, searches for L in this node. Since L is less than M, the second key in the record, the second reference is selected, and the second node in the leaf level of the tree is loaded into Nodes[2].
• After the for loop increments level, the iteration stops, and FindLeaf returns the address of this leaf node.
• After FindLeaf returns, method Search uses an exact search of the leaf node, finds that there is no data record with key L, and returns -1.
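The search for L can be traced with a self-contained sketch. It is not the BTree class from the slides: nodes live in a vector instead of a file, node indexes stand in for file addresses, and the sample tree and its record addresses are assumptions reconstructed from the keys named in the walk-through (root P W, second level D M P and T W).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an index record: keys[i] is the largest key
// reachable through refs[i]. In a leaf, refs[i] is a data record address.
struct Node {
    std::vector<char> keys;
    std::vector<int> refs;
};

// Inexact search: reference of the first key >= key, or -1 if none.
int inexactSearch(const Node& n, char key) {
    for (std::size_t i = 0; i < n.keys.size(); ++i)
        if (key <= n.keys[i]) return n.refs[i];
    return -1;
}

// Exact search: reference stored with key, or -1 if key is absent.
int exactSearch(const Node& n, char key) {
    for (std::size_t i = 0; i < n.keys.size(); ++i)
        if (key == n.keys[i]) return n.refs[i];
    return -1;
}

// Walk from the root (nodes[0]) down to the leaf level and return the
// index of the leaf that should hold key, or -1 if key is too large.
int findLeaf(const std::vector<Node>& nodes, int height, char key) {
    int addr = 0;
    for (int level = 1; level < height; ++level) {
        addr = inexactSearch(nodes[addr], key);
        if (addr < 0) return -1;
    }
    return addr;
}

// Search: find the leaf, then look for an exact match inside it.
int search(const std::vector<Node>& nodes, int height, char key) {
    int leaf = findLeaf(nodes, height, key);
    return leaf < 0 ? -1 : exactSearch(nodes[leaf], key);
}

// Assumed three-level sample tree; record addresses 100.. are made up.
std::vector<Node> sampleTree() {
    return {
        Node{{'P', 'W'}, {1, 2}},                          // 0: root
        Node{{'D', 'M', 'P'}, {3, 4, 5}},                  // 1: second level
        Node{{'T', 'W'}, {6, 7}},                          // 2: second level
        Node{{'A', 'B', 'C', 'D'}, {100, 101, 102, 103}},  // 3: leaf
        Node{{'G', 'I', 'M'}, {104, 105, 106}},            // 4: leaf
        Node{{'N', 'P'}, {107, 108}},                      // 5: leaf
        Node{{'R', 'S', 'T'}, {109, 110, 111}},            // 6: leaf
        Node{{'U', 'W'}, {112, 113}},                      // 7: leaf
    };
}
```

On this tree, findLeaf for L follows the first reference in the root (L is less than P), then the second reference in the next node (L is less than M), and arrives at the leaf holding G I M. The exact search there fails, so search returns -1, as in the walk-through.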

Insertion
Important observations about the splitting and promotion process:
1. It begins with a search that proceeds all the way down to the leaf level; and
2. After finding the insertion location at the leaf level, the work of insertion, overflow detection, and splitting proceeds upward from the bottom.
Insertion is an iterative process with three phases:
1. Search to the leaf level, using method FindLeaf, before the iteration;
2. Insertion, overflow detection and splitting on the upward path;
3. Creation of a new root node, if the current root was split.

Insertion
Contd.
• Consider inserting R and its data record address (recAddr) into the tree.
• The first operation in method Insert is to search from the root down to the leaf level for key R, using FindLeaf:
  thisNode = FindLeaf(key);
• FindLeaf loads a complete branch into memory.
• The next step is to insert R into the leaf node:
  result = thisNode->Insert(key, recAddr);
• The result here is that an overflow is detected.
• The object thisNode now has five keys. The node must be split into two nodes, using the following code:
  newNode = NewNode();
  thisNode->Split(newNode);
  Store(thisNode);
  Store(newNode);
• Now the two nodes, one with keys R, S and T, and one with U and W, have been stored back in the file.

Insertion
Contd.
• The next step is to update the parent node. Since the largest key in thisNode has changed, method UpdateKey is used to record the change:
  parentNode->UpdateKey(largestKey, thisNode->LargestKey());
• Hence the value W in the root is changed to T. Then the largest value in the new node is inserted into the root of the tree:
  parentNode->Insert(newNode->LargestKey(), newNode->RecAddr);
• The value W is inserted into the root, promoting the key W. This causes the root to overflow with five keys.
• The root is split, resulting in a node with keys D, M and P and one with T and W.
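The update and promotion steps on the parent can be sketched on their own. The IndexKeys type below is a hypothetical simplification that keeps only a sorted string of single-character keys and drops the references; it is not the UpdateKey or Insert of the BTreeNode class.

```cpp
#include <algorithm>
#include <string>

// Sketch of an index record as a sorted string of single-character keys.
// References are omitted; all names here are hypothetical.
struct IndexKeys {
    std::string keys;

    // Insert key at its sorted position.
    void insert(char key) {
        keys.insert(std::upper_bound(keys.begin(), keys.end(), key), key);
    }

    // Replace oldKey by newKey; returns false if oldKey is not present.
    bool updateKey(char oldKey, char newKey) {
        std::string::size_type pos = keys.find(oldKey);
        if (pos == std::string::npos) return false;
        keys.erase(pos, 1);
        insert(newKey);
        return true;
    }

    // True when the record holds more than maxKeys keys.
    bool overflow(std::string::size_type maxKeys) const {
        return keys.size() > maxKeys;
    }
};
```

Starting from a root with keys D M P W, updateKey('W', 'T') gives D M P T, and insert('W') then gives D M P T W. With a maximum of 4 keys, overflow(4) reports that the root must now be split.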

Insertion
Contd.
• A new root node is created, and the keys P and W are inserted into it, as in the code below.
• The code begins by appending the old root node to the B-tree file.
• Insert relies on support functions such as BTreeNode::Split.
int newAddr = BTreeFile.Append(Root);  // put previous root into file
// insert 2 keys in new root node
Root.Keys[0] = thisNode->LargestKey();
Root.RecAddrs[0] = newAddr;
Root.Keys[1] = newNode->LargestKey();
Root.RecAddrs[1] = newNode->RecAddr;
Root.NumKeys = 2;
Height++;
Insertion (contd.)

    template <class keyType>
    int BTreeNode<keyType>::Split(BTreeNode<keyType> * newNode)
    {
        // find the first key to be moved into the new node
        int midpt = (NumKeys + 1) / 2;
        int numNewKeys = NumKeys - midpt;
        // move the keys and recaddrs from this to newNode
        for (int i = midpt; i < NumKeys; i++)
        {
            newNode->Keys[i - midpt] = Keys[i];
            newNode->RecAddrs[i - midpt] = RecAddrs[i];
        }
        // set number of keys in the two nodes
        newNode->NumKeys = numNewKeys;
        NumKeys = midpt;
        return 1;
    }

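The split and the creation of a new root can be tried out in memory with a much smaller stand-in for the node class. The following is only a sketch: SimpleNode, split, largestKey and makeNewRoot are hypothetical names, the node stores keys only (no record addresses), and nothing is written to a B-tree file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for BTreeNode: keys only, kept in ascending order.
struct SimpleNode {
    std::vector<char> keys;

    char largestKey() const {
        assert(!keys.empty());
        return keys.back();
    }

    // Move the upper part of the keys into newNode, using the same midpoint
    // rule as Split: this node keeps the first (n + 1) / 2 keys.
    void split(SimpleNode& newNode) {
        const std::size_t midpt = (keys.size() + 1) / 2;
        newNode.keys.assign(keys.begin() + static_cast<std::ptrdiff_t>(midpt),
                            keys.end());
        keys.resize(midpt);
    }
};

// Build a new root that holds the largest key of each half of a split root.
SimpleNode makeNewRoot(const SimpleNode& left, const SimpleNode& right) {
    SimpleNode root;
    root.keys = {left.largestKey(), right.largestKey()};
    return root;
}
```

With five keys A to E, the old node keeps A, B, C, the new node receives D and E, and the new root holds C and E.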
B-Tree Nomenclature

- Some authors define the order of a B-tree as the minimum number of keys that can be in a page of the tree.
- Other authors define the order of a B-tree to be the maximum number of descendants that a page can have. The properties in this module use this second definition.
- When a page of a B-tree is split, the descendants are divided as evenly as possible between the new page and the old page.
- Consequently, every page except the root and the leaves has at least $\lceil m/2 \rceil$ descendants.
- Another term that different authors use differently is leaf.

Formal Definition of B-Tree Properties

Properties of a B-tree of order m:

- Every page has a maximum of m descendants.
- Every page, except for the root and the leaves, has at least $\lceil m/2 \rceil$ descendants.
- The root has at least two descendants (unless it is a leaf).
- All the leaves appear on the same level.
- The leaf level forms a complete, ordered index of the associated data file.

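The descendant-count properties can be expressed as a small predicate. This is a minimal sketch, not part of the B-tree classes used in this module: ceilDiv and pageCountIsValid are hypothetical helpers, and leaves are assumed to be subject only to the upper bound.

```cpp
#include <cassert>

// Smallest integer >= a / b, for a >= 0 and b > 0.
int ceilDiv(int a, int b) {
    assert(a >= 0 && b > 0);
    return (a + b - 1) / b;
}

// Hypothetical check: does a page with `descendants` descendants satisfy the
// count properties of a B-tree of order m?
bool pageCountIsValid(int descendants, int m, bool isRoot, bool isLeaf) {
    if (descendants > m) return false;    // no page may exceed m descendants
    if (isLeaf) return true;              // assumption: no lower bound applied to leaves
    if (isRoot) return descendants >= 2;  // a non-leaf root needs at least two
    return descendants >= ceilDiv(m, 2);  // interior pages: at least ceil(m/2)
}
```

For order 512, an interior page with 256 descendants passes the check and one with 255 does not.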
Worst-Case Search Depth

- It is important to understand the relationship between the page size of a B-tree, the number of keys to be stored in the tree, and the number of levels to which the tree can extend.
- Example: 1,000,000 keys stored in a B-tree of order 512 (maximum 511 keys per page). We need to calculate the maximum height of a tree with 1,000,000 keys in the leaves.
- The worst-case depth occurs when every page of the tree has only the minimum number of descendants.
- In such a case the keys are spread over a maximal height for the tree and a minimal breadth.
- For a B-tree of order m, the minimum number of descendants from the root page is 2, so the second level of the tree contains only 2 pages.
- Each of these pages in turn has at least $\lceil m/2 \rceil$ descendants.

Worst-Case Search Depth (contd.)

Level       Minimum number of descendants
1 (root)    $2$
2           $2 \times \lceil m/2 \rceil$
3           $2 \times \lceil m/2 \rceil \times \lceil m/2 \rceil$
...         ...
d           $2 \times \lceil m/2 \rceil^{d-1}$

- In general, for any level d of a B-tree, the minimum number of descendants extending from that level is $2 \times \lceil m/2 \rceil^{d-1}$.
- For any tree with N keys in its leaves, the number of keys is at least the minimum number of descendants from level d, so the relationship between the keys and the height d is

  $N \geq 2 \times \lceil m/2 \rceil^{d-1}$

Worst-Case Search Depth (contd.)

- Solving for d gives the upper bound for the depth of a B-tree with N keys:

  $d \leq 1 + \log_{\lceil m/2 \rceil}(N/2)$    (2)

- For N = 1,000,000 and m = 512: $d \leq 1 + \log_{256} 500{,}000$, or $d \leq 3.37$.
- So, given 1,000,000 keys, a B-tree of order 512 has a depth of no more than three (3) levels.

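The bound can be checked with integer arithmetic instead of logarithms by growing the minimum number of descendants level by level until it would exceed N. This is a sketch under stated assumptions: worstCaseDepth is a hypothetical helper, and it assumes N is at least 2 and m is at least 3.

```cpp
#include <cassert>

// Largest depth d with 2 * ceil(m/2)^(d-1) <= n, that is, the worst-case depth
// of a B-tree of order m holding n keys in its leaves.
int worstCaseDepth(long long n, int m) {
    assert(n >= 2 && m >= 3);
    const long long minFanout = (m + 1) / 2;  // ceil(m/2)
    int depth = 1;
    long long minDescendants = 2;             // level 1 (the root)
    // Dividing n instead of multiplying minDescendants avoids overflow.
    while (minDescendants <= n / minFanout) {
        minDescendants *= minFanout;
        ++depth;
    }
    return depth;
}
```

For 1,000,000 keys and order 512 the minimum descendant counts grow as 2, 512, 131,072, and the next step would exceed N, so the function returns 3.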
Deletion, Merging and Redistribution

The B-tree rules state the following:

- Every page except for the root and the leaves has at least $\lceil m/2 \rceil$ descendants.
- A page contains at least $\lceil m/2 \rceil$ keys and no more than m keys.
- The process of page splitting guarantees that these properties are maintained when new keys are inserted into the tree.
- We need an equally reliable guarantee that these properties are maintained when keys are deleted from the tree.

Deletion, Merging and Redistribution
Example for Deletion

- Try deleting some of the keys from the tree built in the previous construction steps.
- Deleting C from the first leaf node does not cause an underflow in the node and does not change its largest value.
- Deleting P from the second leaf node does not cause an underflow, but it does change the largest key in the node.
- Hence, the second-level node must be modified to reflect the change: its key for that leaf must become O instead of P.
- Deleting H causes an underflow in the third leaf node.
- After H is deleted, the last remaining key in the node, I, is inserted into the neighbor node, and the third leaf node is deleted.

Deletion, Merging and Redistribution
Example for Deletion (contd.)

- Because the second leaf node has only three keys, there is room for the key I in that node. This is a general merge operation.
- After the merge, the second-level node is modified to reflect the current status of the leaf nodes.
- Merging and other operations can propagate to the root of the B-tree.
- If the root ends up with only one key and one child, it can be eliminated.
- Its sole child node becomes the new root of the tree, and the tree gets shorter by one level.

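The merge step in the example can be sketched on plain key lists. The helper mergeInto and the sample keys in the usage note are hypothetical; the sketch assumes both key lists are sorted and leaves the update of the parent index to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: merge the keys of an underflowing leaf into a sibling.
// Returns false, leaving both nodes unchanged, if the sibling lacks room.
bool mergeInto(std::vector<char>& sibling, std::vector<char>& underflowing,
               std::size_t maxKeys) {
    if (sibling.size() + underflowing.size() > maxKeys) return false;
    std::vector<char> merged;
    merged.reserve(sibling.size() + underflowing.size());
    std::merge(sibling.begin(), sibling.end(),
               underflowing.begin(), underflowing.end(),
               std::back_inserter(merged));
    sibling.swap(merged);
    underflowing.clear();  // the emptied node can now be removed from the tree
    return true;
}
```

For instance, merging a node that holds only I into a sibling with three hypothetical keys M, N, O and a capacity of four keys yields I, M, N, O in the sibling and an empty node to delete.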
Rules for Deleting a Key k from a Node n

1. If n has more than the minimum number of keys and k is not the largest key in n, simply delete k from n.
2. If n has more than the minimum number of keys and k is the largest key in n, delete k and modify the higher-level indexes to reflect the new largest key in n.
3. If n has exactly the minimum number of keys and one of the siblings of n has few enough keys, merge n with its sibling and delete a key from the parent node.
4. If n has exactly the minimum number of keys and one of the siblings of n has extra keys, redistribute by moving some keys from a sibling to n, and modify the higher-level indexes to reflect the new largest keys in the affected nodes.

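The four rules amount to a decision on a few counts. The sketch below is one possible reading, not the module's implementation: DeleteAction and chooseDeleteAction are hypothetical names, "few enough keys" is taken to mean that the sibling's keys plus the keys left in n fit in one page, and merging is tested before redistribution simply because the rules list it first.

```cpp
#include <cassert>

enum class DeleteAction {
    SimpleDelete,          // rule 1
    DeleteAndUpdateIndex,  // rule 2
    Merge,                 // rule 3
    Redistribute,          // rule 4
    NoRuleApplies
};

// nodeKeys: keys currently in n; siblingKeys: keys in the sibling considered;
// minKeys and maxKeys: per-page limits; keyIsLargest: whether k is n's largest key.
DeleteAction chooseDeleteAction(int nodeKeys, int siblingKeys,
                                int minKeys, int maxKeys, bool keyIsLargest) {
    assert(minKeys >= 1 && maxKeys >= minKeys && nodeKeys >= minKeys);
    if (nodeKeys > minKeys) {
        return keyIsLargest ? DeleteAction::DeleteAndUpdateIndex
                            : DeleteAction::SimpleDelete;
    }
    // n holds exactly the minimum, so deleting k would make it underflow.
    if (siblingKeys + (nodeKeys - 1) <= maxKeys) return DeleteAction::Merge;
    if (siblingKeys > minKeys) return DeleteAction::Redistribute;
    return DeleteAction::NoRuleApplies;
}
```

With a minimum of 2 and a maximum of 4 keys per page, a node with 2 keys next to a sibling with 3 keys is merged, while the same node next to a full sibling with 4 keys triggers redistribution.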
Redistribution

- Redistribution differs from both splitting and merging in that it never causes the collection of nodes in the tree to change.
- The term siblings implies that the pages have the same parent page.
- Two nodes at the leaf level can be logically adjacent without having the same parent; such nodes are not siblings.
- Redistribution algorithms are generally written so that they do not consider moving keys between nodes that are not siblings, even when the nodes are logically adjacent.

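Redistribution between two siblings can be sketched as evening out their keys while the number of pages stays the same. The helper redistribute is hypothetical; it assumes both pages are sorted, that every key in left is smaller than every key in right, and that the caller updates the parent index afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: even out the keys of two sibling pages.
// No page is created or removed; only keys move between the two pages.
void redistribute(std::vector<char>& left, std::vector<char>& right) {
    std::vector<char> all(left);
    all.insert(all.end(), right.begin(), right.end());
    const std::size_t leftCount = (all.size() + 1) / 2;  // left gets the odd key
    const auto mid = all.begin() + static_cast<std::ptrdiff_t>(leftCount);
    left.assign(all.begin(), mid);
    right.assign(mid, all.end());
}
```

Starting from a left page holding A and a right page holding B, C, D, E, the left page ends with A, B, C and the right page with D, E, so the parent's key for the left page would have to change from A to C.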
Redistribution During Insertion: A Way to Improve Storage Utilization

Advantages

- Redistribution during insertion is a way of avoiding, or at least postponing, the creation of new pages.
- Rather than splitting a full page and creating two approximately half-full pages, redistribution lets us place some of the overflowing keys into another page.
- It tends to make a B-tree more efficient in its utilization of space.
- Space utilization can be measured by viewing the amount of space used to store information as a percentage of the total amount of space required to hold the B-tree.
- Space utilization in a B-tree using two-way splitting is around 50% in the worst case.
- Bayer and McCreight's original paper (1972) suggests using redistribution as an alternative to splitting when possible, splitting a page only when both of its siblings are full.

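The utilization measure described above is a simple ratio. The following sketch uses a hypothetical helper, utilizationPercent, and assumes every page has the same key capacity.

```cpp
#include <cassert>

// Hypothetical helper: percentage of key slots in use across a set of pages.
double utilizationPercent(long long keysStored, long long pages, int keysPerPage) {
    assert(pages > 0 && keysPerPage > 0);
    assert(keysStored >= 0 && keysStored <= pages * keysPerPage);
    return 100.0 * static_cast<double>(keysStored)
           / static_cast<double>(pages * keysPerPage);
}
```

As an illustration with pages of 4 keys: a full page that receives a fifth key and is split leaves 5 keys in 2 pages (62.5%), whereas moving the overflow into a sibling that already holds 2 keys leaves 7 keys in 2 pages (87.5%).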
B* Trees

- In his review of work on B-trees in 1976, Knuth (1998) extends the notion of redistribution during insertion to include new rules for splitting. The resulting variation on the fundamental B-tree form is called a B* tree.
- Consider a system in which we are postponing splitting through redistribution.
- If we are considering any page other than the root, we know that when it is finally time to split, the page has at least one sibling that is also full.
- This opens up the possibility of a two-to-three split rather than the usual two-way split.
- A two-to-three split results in pages that are each about two-thirds full rather than just half full.

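A two-to-three split can be sketched as spreading the keys of two sibling pages over three pages. The helper twoToThreeSplit is hypothetical; it assumes both key lists are sorted, that all keys in left precede those in right, and that the new key has already been placed in one of them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: spread the keys of two full sibling pages over three
// pages. Returns the three resulting pages, lowest keys first.
std::vector<std::vector<char>> twoToThreeSplit(const std::vector<char>& left,
                                               const std::vector<char>& right) {
    std::vector<char> all(left);
    all.insert(all.end(), right.begin(), right.end());
    std::vector<std::vector<char>> pages(3);
    const std::size_t base = all.size() / 3;
    const std::size_t extra = all.size() % 3;  // first `extra` pages get one more key
    std::size_t next = 0;
    for (std::size_t p = 0; p < 3; ++p) {
        const std::size_t count = base + (p < extra ? 1 : 0);
        pages[p].assign(all.begin() + static_cast<std::ptrdiff_t>(next),
                        all.begin() + static_cast<std::ptrdiff_t>(next + count));
        next += count;
    }
    return pages;
}
```

With A to E in the overflowing page and F to I in its full sibling, the nine keys end up as three pages of three keys each.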
B* Trees
Properties

1. Every page has a maximum of m descendants.
2. Every page except for the root has at least $\lceil (2m - 1)/3 \rceil$ descendants.
3. The root has at least two descendants (unless it is a leaf).
4. All the leaves appear on the same level.

- The root never has a sibling, and without a sibling no two-to-three split is possible.
- One approach is to allow the root to grow larger than the other pages so that, when it does split, it can produce two pages that are each about two-thirds full.
- This has the advantage of ensuring that all pages below the root level adhere to B* tree characteristics.

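The change in property 2 can be made concrete by comparing the two minimums for the same order. Both function names below are hypothetical; the arithmetic is integer ceiling division.

```cpp
#include <cassert>

// ceil((2m - 1) / 3): minimum descendants of a non-root page in a B* tree of order m.
int bStarMinDescendants(int m) {
    assert(m >= 2);
    return (2 * m - 1 + 2) / 3;
}

// ceil(m / 2): the corresponding minimum for a conventional B-tree of order m.
int bTreeMinDescendants(int m) {
    assert(m >= 2);
    return (m + 1) / 2;
}
```

For order 512 the minimum rises from 256 descendants in a conventional B-tree to 341 in a B* tree.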
