Cosequential Processing and The Sorting of Large Files: Mr. Balaji N
Module III
Matching Names in Two Lists
- Names common to the two lists: a match operation, or an intersection.
- Assume that no list contains duplicate names and that both lists are sorted in ascending order.
- Start by reading in the initial item from each list; we find that they match.
- Output this first item as a member of the match set, or intersection.
- We then read in the next item from each list. This time the item in List 2 is less than the item in List 1.
- Ex: to match the item CARTER from List 1, scan down List 2 until we either find it or jump beyond it, and continue the process.
- Eventually we come to the end of one of the lists.
Matching Names in Two Lists
Contd...
- Initializing: we need to arrange things in such a way that the procedure gets going properly.
- Getting and accessing the next list item: we need simple methods that support getting the next list element and accessing it.
- Synchronizing: we have to make sure that the current item from one list is never so far ahead of the current item on the other list that a match will be missed.
- Handling end-of-file conditions: when we get to the end of either list, the procedure must halt.
{
    int MoreItems; // true while both lists still have items
    InitializeList (1, List1Name); // initialize List 1
    InitializeList (2, List2Name); // initialize List 2
    InitializeOutput (OutputListName);
    // get the first item from both lists
    MoreItems = NextItemInList(1) && NextItemInList(2);
    while (MoreItems)
    {
        if (Item(1) < Item(2))
            [...]
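The matching procedure described above can be sketched as a single loop with a three-way comparison. This is a minimal, self-contained illustration that works on in-memory vectors rather than files; the function name matchLists is an assumption made for this sketch and is not part of the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Cosequential match (intersection) of two lists.
// Assumes, as the slides do, that each list is sorted in ascending
// order and contains no duplicates.
std::vector<std::string> matchLists(const std::vector<std::string>& list1,
                                    const std::vector<std::string>& list2)
{
    assert(std::is_sorted(list1.begin(), list1.end()));
    assert(std::is_sorted(list2.begin(), list2.end()));

    std::vector<std::string> out;
    std::size_t i = 0, j = 0;
    // The loop ends as soon as either list is exhausted.
    while (i < list1.size() && j < list2.size()) {
        if (list1[i] < list2[j]) {
            ++i;                      // list 1 is behind: advance it
        } else if (list2[j] < list1[i]) {
            ++j;                      // list 2 is behind: advance it
        } else {
            out.push_back(list1[i]);  // match: output once, advance both
            ++i;
            ++j;
        }
    }
    return out;
}
```

Calling matchLists on {ADAMS, CARTER, DAVIS} and {ADAMS, BECH, CARTER, ROSEWALD} yields ADAMS and CARTER; only the list that is behind is advanced, so no match can be skipped.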
Merging Two Lists
- The three-way-test, single-loop model for cosequential processing can easily be modified to handle merging of lists.
- The difference between matching and merging is that with merging we must read completely through each of the lists.
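A merge can be sketched with the same three-way comparison, producing output in every branch and continuing until both lists are exhausted. The function name mergeLists and the use of in-memory vectors are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Cosequential merge (union) of two ascending, duplicate-free lists.
// A name present in both lists is written to the output only once.
std::vector<std::string> mergeLists(const std::vector<std::string>& a,
                                    const std::vector<std::string>& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<std::string> out;
    std::size_t i = 0, j = 0;
    // Unlike matching, the loop runs until BOTH lists are exhausted.
    while (i < a.size() || j < b.size()) {
        if (j == b.size() || (i < a.size() && a[i] < b[j])) {
            out.push_back(a[i++]);    // only list a remains, or a is smaller
        } else if (i == a.size() || b[j] < a[i]) {
            out.push_back(b[j++]);    // only list b remains, or b is smaller
        } else {
            out.push_back(a[i]);      // equal items: output once
            ++i;
            ++j;
        }
    }
    return out;
}
```

Merging {ADAMS, CARTER} with {ADAMS, BECH, ROSEWALD} gives ADAMS, BECH, CARTER, ROSEWALD.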
Outline
- An Object Oriented Model for Implementing Cosequential Files
- Matching Names in Two Lists
- Merging Two Lists
- Application of the Model to a General Ledger Problem
- Extension of the Model to Include Multiway Merging
- Multilevel Indexing and B-Trees
- Introduction: The Invention of B-Tree
- AVL Trees
- Paged Binary Tree
- Multilevel Indexing: B-Tree Indexes
- An Object Oriented Representation of B-Tree
- B-Tree Methods: Search, Insert and Others
- B-Tree Nomenclature
- Formal Definition of B-Tree Properties
- Worst-Case Search Depth
- Deletion, Merging and Redistribution
- Redistribution
- Redistribution During Insertion: A Way to [...]
Application of the Model to a General Ledger Problem
- [...] accounts.
- Portion of the ledger containing only checking and expense accounts.
- The journal file contains the monthly transactions that are ultimately to be posted to the ledger file. Entries in [...]
- This solution involves seeking back and forth across the ledger file as we work through the journal.
- Better solution: begin by collecting all the journal transactions that relate to a given account.
- This means sorting the journal transactions by account number, producing a list ordered by account number.
- Create the output list by using the ledger and the sorted journal cosequentially: process the two lists sequentially and in parallel.
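The posting step can be sketched as one cosequential pass over the ledger and the sorted journal. The record layouts, the field names, the function name postJournal and the account numbers used in the usage note are all hypothetical; amounts are kept in integer cents to avoid rounding issues.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct LedgerAccount { int acctNo; long balanceCents; };  // hypothetical layout
struct JournalEntry  { int acctNo; long amountCents;  };  // hypothetical layout

// Posts all journal entries to the ledger in a single pass.
// The ledger must be sorted by account number; the journal is sorted here.
// Returns the number of journal entries whose account is not in the ledger.
int postJournal(std::vector<LedgerAccount>& ledger,
                std::vector<JournalEntry> journal)
{
    assert(std::is_sorted(ledger.begin(), ledger.end(),
        [](const LedgerAccount& x, const LedgerAccount& y) {
            return x.acctNo < y.acctNo; }));
    // Collect the transactions of each account together by sorting;
    // stable_sort keeps the original order within one account.
    std::stable_sort(journal.begin(), journal.end(),
        [](const JournalEntry& x, const JournalEntry& y) {
            return x.acctNo < y.acctNo; });

    int unmatched = 0;
    std::size_t l = 0, j = 0;
    while (j < journal.size()) {
        if (l == ledger.size() || journal[j].acctNo < ledger[l].acctNo) {
            ++unmatched;              // journal entry for an unknown account
            ++j;
        } else if (ledger[l].acctNo < journal[j].acctNo) {
            ++l;                      // finished with this ledger account
        } else {
            ledger[l].balanceCents += journal[j].amountCents;
            ++j;                      // stay on the account: more may follow
        }
    }
    return unmatched;
}
```

Unlike the simple match, the ledger position is not advanced on a match, because several journal entries may belong to the same account. With a ledger of accounts 101, 102 and 505 and journal entries for 505, 101, 999 and 101, each ledger account is visited once and the entry for 999 is reported as unmatched.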
Extension of the Model to Include Multiway Merging
- [...] that are being used from each list in the cosequential process:
      list[0], list[1], list[2], ... list[k-1]
      item[0], item[1], item[2], ... item[k-1]
- MinIndex: a function that finds the index of the item with the minimum collating-sequence value, followed by an inner loop that finds all lists that are using that item.
- Each cycle therefore consists of finding the minimum and testing to see in which lists the item occurs, and which files therefore need to be read.
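A K-way merge built on a MinIndex-style function might look like the following sketch. The name minIndex follows the slide; its signature, the position vector pos and the function kWayMerge are assumptions made for this illustration, and the lists are held in memory rather than read from files.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using List = std::vector<std::string>;

// Index of the list whose current item is smallest, or -1 if every
// list is exhausted. pos[i] is the position of the current item of list[i].
int minIndex(const std::vector<List>& list, const std::vector<std::size_t>& pos)
{
    int best = -1;
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (pos[i] == list[i].size()) continue;          // list i is finished
        if (best < 0 || list[i][pos[i]] < list[best][pos[best]])
            best = static_cast<int>(i);
    }
    return best;
}

// Merges k ascending lists; an item found in several lists is output once.
List kWayMerge(const std::vector<List>& list)
{
    for (const List& l : list) assert(std::is_sorted(l.begin(), l.end()));

    std::vector<std::size_t> pos(list.size(), 0);
    List out;
    for (int m = minIndex(list, pos); m >= 0; m = minIndex(list, pos)) {
        const std::string item = list[m][pos[m]];
        out.push_back(item);
        // Inner loop: every list currently holding this item must read on.
        for (std::size_t i = 0; i < list.size(); ++i)
            if (pos[i] < list[i].size() && list[i][pos[i]] == item) ++pos[i];
    }
    return out;
}
```

Each output item costs one sequential scan over the k current items, which is acceptable for small k.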
A Selection Tree for Merging a Large Number of Lists
- When we begin merging a larger number of lists, the set of sequential comparisons needed to find the key with the minimum value becomes noticeably expensive.
- If there is a need to merge considerably more than eight lists, we could replace the loop of comparisons with a selection tree.
- A selection tree is a classic time-versus-space trade-off: it reduces the time required to find the key with the lowest value by using a data structure to save information about the relative key values across cycles of the procedure's main loop.
- It is a kind of tournament tree in which each higher-level node represents the winner of the comparison between its two descendant keys.
- The minimum value is always at the root node of the tree, and each key has an associated reference to the list from which it came. The tree is a binary tree.
- The number of comparisons required to establish a new tournament winner is related to the depth of the binary tree.
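A selection (winner) tree can be sketched as an array-based binary tree in which every node records which list currently wins at that node. The class name SelectionTree, its interface and the use of integer keys are assumptions made for this sketch; duplicate keys across lists are output once per occurrence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Winner tree over k sorted lists. Node n has children 2n and 2n+1, and the
// leaves start at index leafBase_. Each node stores the index of the list
// whose current key wins there, or -1 if nothing is left below it.
class SelectionTree {
public:
    explicit SelectionTree(const std::vector<std::vector<int>>& lists)
        : lists_(lists), pos_(lists.size(), 0)
    {
        assert(!lists_.empty());
        leafBase_ = 1;
        while (leafBase_ < lists_.size()) leafBase_ *= 2;
        node_.assign(2 * leafBase_, -1);
        for (std::size_t i = 0; i < lists_.size(); ++i)
            node_[leafBase_ + i] = lists_[i].empty() ? -1 : static_cast<int>(i);
        for (std::size_t n = leafBase_ - 1; n >= 1; --n)   // play all matches
            node_[n] = winner(node_[2 * n], node_[2 * n + 1]);
    }

    bool empty() const { return node_[1] < 0; }

    // Removes and returns the minimum key (always found at the root), then
    // replays only the matches on the path from that list's leaf to the root.
    int pop()
    {
        assert(!empty());
        const int src = node_[1];
        const int key = lists_[src][pos_[src]++];
        std::size_t n = leafBase_ + static_cast<std::size_t>(src);
        if (pos_[src] == lists_[src].size()) node_[n] = -1;  // list exhausted
        for (n /= 2; n >= 1; n /= 2)
            node_[n] = winner(node_[2 * n], node_[2 * n + 1]);
        return key;
    }

private:
    int winner(int a, int b) const
    {
        if (a < 0) return b;
        if (b < 0) return a;
        return lists_[b][pos_[b]] < lists_[a][pos_[a]] ? b : a;
    }

    std::vector<std::vector<int>> lists_;
    std::vector<std::size_t> pos_;
    std::size_t leafBase_ = 1;
    std::vector<int> node_;
};
```

Calling pop() until empty() returns true yields the merged sequence; each call replays one comparison per tree level, so the cost per item grows with the depth of the tree rather than with the number of lists.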
Introduction: The Invention of B-Tree
- Douglas Comer, survey article: The Ubiquitous B-Tree (1979).
- The problem: to discover a general method for storing and retrieving data in large file systems that would provide rapid access to the data with minimal overhead cost.
- R. Bayer and E. McCreight published an article, [...]
1. Searching the index must be faster than binary searching:
   - Searching for a key on a disk often involves seeking to different disk tracks.
   - Seeks are expensive for large files, and it needs more [...]

Indexing with Binary Search Trees
- [...] to create nodes that contain right and left link fields so the binary search tree can be constructed as a linked structure.
- A BST is not fast enough for disk-resident indexing, and it lacks an effective strategy for balancing the tree.
- Solution: AVL trees and paged binary trees.
- We no longer have to sort the file to perform a binary search: the records in the table appear in random rather than sorted order.
Indexing with Binary Search Trees
Contd...
- The sequence of records in the file has no relation to the structure of the tree; all the information about the logical structure is carried in the link fields.
- If we add a new key to the file, we need only link it to the appropriate leaf node to create a tree that provides search performance that is as good as we would get with a binary search on a sorted list.
- Search performance on this tree is still good because [...]
- Consider trees that are built by placing the keys into the tree as they occur, without rearrangement.
- A binary search on a sorted list of these 24 keys requires only 5 seeks in the worst case.
- If each node is treated as a fixed-length record in which the link fields contain relative record numbers (RRNs) [...]
RRN   Key   Left child   Right child
5     YJ    -            -
6     PA    11           2
7     FT    -            -
8     HN    7            1
9     KF    0            3
10    CL    4            12
11    NR    -            -
12    DE    -            -
13    WS    14           5
14    TK    -            -
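Searching a binary tree stored as fixed-length records with RRN link fields can be sketched as follows. The struct NodeRecord, the function searchByRRN and the seven-record file used in the usage note are hypothetical; -1 stands for an empty link field.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One fixed-length record: a key plus left and right links given as
// relative record numbers (RRNs); -1 means "no child".
struct NodeRecord {
    std::string key;
    int left;
    int right;
};

// Follows the link fields from the root record. Returns the RRN of the
// record holding key, or -1 if it is absent. If visited is non-null it
// receives the number of records read (each may cost one seek on disk).
int searchByRRN(const std::vector<NodeRecord>& file, int rootRRN,
                const std::string& key, int* visited = nullptr)
{
    int rrn = rootRRN;
    int count = 0;
    while (rrn != -1) {
        assert(rrn >= 0 && rrn < static_cast<int>(file.size()));
        ++count;
        const NodeRecord& rec = file[rrn];
        if (key == rec.key) break;                 // found
        rrn = (key < rec.key) ? rec.left : rec.right;
    }
    if (visited) *visited = count;
    return rrn;
}
```

For a file holding KF (links 1, 2), FB (3, 4), SD (5, 6), AX, HN, PA and WS at RRNs 0 to 6, a search for PA starting at RRN 0 reads KF, SD and PA and returns 5. The physical order of the records plays no role; only the link fields determine the path.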
AVL Trees
- An AVL tree is height-balanced: the heights of the left and right sub-trees of any node differ by at most 1.
- AVL trees rely on adding an extra attribute, the balance factor, to each node.
- This factor indicates whether the tree is left-heavy (the height of the left sub-tree is 1 greater than the right sub-tree), balanced (both sub-trees are the same height) or right-heavy (the height of the right sub-tree is 1 greater than the left sub-tree).
To balance itself, an AVL tree may perform the following four kinds of rotations:
1. Left Rotation,
2. Right Rotation,
3. Left-Right Rotation, and
4. Right-Left Rotation.
The first two rotations are single rotations and the next two are double rotations. To have an unbalanced tree, we need a tree of height at least 2.
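The balance factor and the four rotations can be sketched on a small in-memory node type. The type AvlNode and the function names are assumptions made for this illustration; height is counted in edges, so a single node has height 0 and a chain of three nodes has height 2.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

struct AvlNode {
    int key;
    std::unique_ptr<AvlNode> left, right;
    explicit AvlNode(int k) : key(k) {}
};
using NodePtr = std::unique_ptr<AvlNode>;

// Height in edges: an empty tree is -1, a single node is 0.
int height(const AvlNode* n)
{
    return n ? 1 + std::max(height(n->left.get()), height(n->right.get())) : -1;
}

// Positive: left-heavy. Zero: balanced. Negative: right-heavy.
int balanceFactor(const AvlNode* n)
{
    return height(n->left.get()) - height(n->right.get());
}

// Left rotation: the right child becomes the root of the subtree.
NodePtr rotateLeft(NodePtr x)
{
    assert(x && x->right);
    NodePtr y = std::move(x->right);
    x->right = std::move(y->left);
    y->left = std::move(x);
    return y;
}

// Right rotation: the left child becomes the root of the subtree.
NodePtr rotateRight(NodePtr x)
{
    assert(x && x->left);
    NodePtr y = std::move(x->left);
    x->left = std::move(y->right);
    y->right = std::move(x);
    return y;
}

// Double rotations are two single rotations applied in sequence.
NodePtr rotateLeftRight(NodePtr x)
{
    x->left = rotateLeft(std::move(x->left));
    return rotateRight(std::move(x));
}

NodePtr rotateRightLeft(NodePtr x)
{
    x->right = rotateRight(std::move(x->right));
    return rotateLeft(std::move(x));
}
```

For the right-leaning chain 1, 2, 3 the root has balance factor -2, and a single left rotation makes 2 the root with children 1 and 3. A full AVL insertion would also choose the rotation from the balance factors on the way back up, which this sketch leaves out.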
AVL Tree
Features
1. By setting a maximum allowable difference in the height of any two subtrees, AVL trees guarantee a minimum level of search performance.
2. Maintaining the tree in AVL form as keys are inserted uses one of the four rotations, each of which is confined to a single, local area of the tree.
- AVL trees are not themselves directly applicable to most file structure problems because, like all strictly binary trees, they have too many levels: they are too deep.
- AVL trees guarantee that search performance approximates that of a completely balanced tree.
- For a completely balanced tree, the worst-case search to find a key, given N possible keys, looks at log2(N + 1) levels.
- For an AVL tree, the worst-case search looks at about 1.44 log2(N + 2) levels.
Paged Binary Tree
- Divide the binary tree into pages and then store each page in a block of contiguous locations on disk.
- This reduces the number of seeks associated with any search.
- It has the potential to result in faster searching on secondary storage.
- A typical 8 KB page can hold 511 key/reference field pairs.
- Assume that each page contains a completely balanced full binary tree and that the pages are themselves organized as a completely balanced full tree.
- The number of seeks required for a worst-case search of a completely full, balanced binary tree is log2(N + 1), where N is the number of keys in the tree.
- For the paged version of a completely full, balanced tree it is log_{k+1}(N + 1), where k is the number of keys held in a single page.
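The two seek counts can be compared with a small integer calculation. The function name worstCaseSeeks is an assumption, and the value N = 134,217,727 in the usage note is an illustrative choice for which N + 1 = 2^27 = 512^3, so both logarithms come out as whole numbers.

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of levels d such that (k + 1)^d >= n + 1, that is,
// ceil(log_{k+1}(n + 1)), computed with integers only.
// Intended for values that fit comfortably in 64 bits.
int worstCaseSeeks(std::uint64_t n, std::uint64_t keysPerPage)
{
    assert(keysPerPage >= 1);
    const std::uint64_t fanout = keysPerPage + 1;
    int levels = 0;
    std::uint64_t capacity = 0;   // keys held by a full tree: fanout^levels - 1
    while (capacity < n) {
        // fanout^(d+1) - 1 == fanout * (fanout^d - 1) + keysPerPage
        capacity = capacity * fanout + keysPerPage;
        ++levels;
    }
    return levels;
}
```

With one key per page (the unpaged binary tree), worstCaseSeeks(134217727, 1) gives 27 seeks; with 511 keys per page, worstCaseSeeks(134217727, 511) gives 3.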
Figure: Paged Binary Tree
Paged Binary Tree
Problems
- Inefficient disk usage.
- Implementation and building of the paged tree.
Creating a B-Tree
- B-trees are multilevel indexes that solve the problem of the linear cost of insertion and deletion.
- Each node of a B-tree is an index record, and each of [...]
- [...] keys are inserted into the record.
- When the fifth key, A, is added, the original node is split and the tree grows by one level as a new root is created.
- The keys in the root are the largest key in the left leaf, D, and the largest key in the right leaf, T.
- The keys M, P and I belong in the rightmost leaf node, since they are larger than the largest key in the left node.
- The insertion of I causes a split, since the rightmost leaf node becomes overfull.

Creating a B-Tree
Contd...
- The largest key in the new node, P, is inserted into the [...]
An Object Oriented Representation of B-Tree
- Class BTreeNode represents the memory-resident B-tree nodes.
- BTreeNode is a template class and has methods to insert and remove a key and to split and merge nodes.
Search
Characteristics
The B-tree methods are tree-searching procedures:
- They are iterative, and
- They work in two stages, operating alternately on entire pages and then within pages.
- Searching means loading a page into memory and then searching through the page, looking for the key at successively lower levels of the tree until the leaf level is reached.
- Search for L:
      recAddr = btree.Search('L');
- Method Search calls method FindLeaf, which searches down a branch of the tree, beginning at the root, which is referenced by the pointer value Nodes[0].
- In the first iteration, with level=1, the line
      recAddr = Nodes[level-1]->Search(key, -1, 0);
  is an inexact search and finds that L is less than P, the first key in the record.

Search
Contd...
- The line Nodes[level] = Fetch(recAddr); reads that second-level node into a new BTreeNode object and makes Nodes[1] point to this new object.
- In the second iteration, with level=2, the search finds that L is less than M, the second key in the record, so the second reference is selected, and the second node in the leaf level of the tree is loaded into Nodes[2].
- After the for loop increments level, the iteration stops, and [...]
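The two-stage search (pick a page, then search within it) can be sketched on a memory-resident tree in which each interior key is the largest key of the corresponding child. The struct Page, the functions findSlot and searchTree and the small tree in the usage note are hypothetical simplifications; they do not reproduce the BTree, FindLeaf or Fetch code discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Page {
    bool leaf;
    std::vector<char> keys;  // leaf: its keys; interior: largest key per child
    std::vector<int> refs;   // leaf: record addresses; interior: child pages
};

// Inexact search within one page: index of the first key >= key,
// or -1 if key is larger than every key in the page.
int findSlot(const Page& p, char key)
{
    for (std::size_t i = 0; i < p.keys.size(); ++i)
        if (key <= p.keys[i]) return static_cast<int>(i);
    return -1;
}

// Iterates from the root down to the leaf level. Returns the record
// address stored with key, or -1 if absent. If pagesRead is non-null it
// receives the number of pages examined (one per level).
int searchTree(const std::vector<Page>& pages, int root, char key,
               int* pagesRead = nullptr)
{
    int cur = root, reads = 0, result = -1;
    while (true) {
        assert(cur >= 0 && cur < static_cast<int>(pages.size()));
        const Page& p = pages[cur];
        ++reads;
        const int slot = findSlot(p, key);
        if (slot < 0) break;                    // key exceeds every key here
        if (p.leaf) {                           // exact test at the leaf level
            if (p.keys[slot] == key) result = p.refs[slot];
            break;
        }
        cur = p.refs[slot];                     // descend to the chosen child
    }
    if (pagesRead) *pagesRead = reads;
    return result;
}
```

In a three-level tree whose root holds P and W and whose first second-level page holds D, M and P, a search for L selects the first reference in the root (L is less than P), the second reference in the second-level page (L is less than M), and then finds L in the leaf after reading three pages.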
Insertion
Important observations about the splitting and promotion process:
1. It begins with a search that proceeds all the way down to the leaf level; and
2. After finding the insertion location at the leaf level, the [...]
- [...] root for key R using FindLeaf:
      thisNode = FindLeaf (key);
- FindLeaf loads a complete branch into memory.
- The next step is to insert R into the leaf node:
      result = thisNode->Insert (key, recAddr);
- The result here is that an overflow is detected.
- The object thisNode now has five keys. The node must be split into two nodes, using the following code:
      newNode = NewNode();
      thisNode->Split (newNode);
      Store (thisNode);
      Store (newNode);
- Now the two nodes, one with keys R, S and T, and one with U and W, have been stored back in the file.
Insertion
Contd...
- The next step is to update the parent node. Since the largest key in thisNode has changed, method UpdateKey is used to record the change:
      parentNode->UpdateKey (largestKey, thisNode->LargestKey());
- [...]
      int newAddr = BTreeFile.Append(Root); // put previous root into file
      // insert 2 keys in new root node
      Root.Keys[0] = thisNode->LargestKey();
      Root.RecAddrs[0] = newAddr;
      [...]
      Height++;
Insertion
Contd...
template <class keyType>
int BTreeNode<keyType>::Split (BTreeNode<keyType> * newNode)
{
    // find the first key to be moved into the new node
    int midpt = (NumKeys+1) / 2;
    int numNewKeys = NumKeys - midpt;
    // move the keys and recaddrs from this to newNode
    for (int i = midpt; i < NumKeys; i++)
    {
        newNode->Keys[i-midpt] = Keys[i];
        newNode->RecAddrs[i-midpt] = RecAddrs[i];
    }
    // set number of keys in the two nodes
    newNode->NumKeys = numNewKeys;
    NumKeys = midpt;
    return 1;
}
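The midpoint rule used by Split can be tried out on a simplified, self-contained node type. The struct SimpleNode and the function splitNode are assumptions made for this sketch; they use vectors instead of the fixed arrays of BTreeNode but apply the same midpoint, (NumKeys + 1) / 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a B-tree node: parallel key and address vectors.
struct SimpleNode {
    std::vector<char> keys;
    std::vector<int> recAddrs;
};

// Moves the upper part of an overfull node into newNode. The first
// (n + 1) / 2 keys stay in node; the remaining keys move to newNode.
void splitNode(SimpleNode& node, SimpleNode& newNode)
{
    assert(node.keys.size() == node.recAddrs.size());
    const std::size_t midpt = (node.keys.size() + 1) / 2;
    const auto mid = static_cast<std::ptrdiff_t>(midpt);

    newNode.keys.assign(node.keys.begin() + mid, node.keys.end());
    newNode.recAddrs.assign(node.recAddrs.begin() + mid, node.recAddrs.end());
    node.keys.resize(midpt);          // keep only the lower part
    node.recAddrs.resize(midpt);
}
```

For a node holding the five keys R, S, T, U, W the midpoint is 3, so R, S and T stay and U and W move to the new node, which agrees with the insertion walk-through above.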
B-Tree Nomenclature
- Some authors define the order of a B-tree as the minimum number of keys that can be in a page of the tree.
- Others define the order of a B-tree to be the maximum number of descendants that a page can have.
- When you split a page of a B-tree, the descendants are divided as evenly as possible between the new page and the old page.
- Every page except the root and the leaves has at least ⌈m/2⌉ descendants, where m is the order taken as the maximum number of descendants.
- Another term that is used differently by different authors is leaf.
Properties of a B-Tree of order m:
I Every page has a maximum of m descendants.
I Every page, except for the root and the leaves, has at least $\lceil m/2 \rceil$ descendants.
I The root has at least two descendants (unless it is a leaf).
I All the leaves appear on the same level.
I The leaf level forms a complete, ordered index of the associated data file.
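The descendant-count rules above can be sketched as a small check. This is only an illustration: the function names `minDescendants` and `descendantCountIsValid` are hypothetical, and the sketch assumes leaves are exempt from the lower bounds, as the wording of the properties suggests.

```cpp
#include <cassert>

// Hypothetical helper names; only the rules themselves come from the slides.

// Smallest number of descendants allowed in a non-root, non-leaf page
// of a B-tree of order m, i.e. ceil(m / 2) computed with integers.
int minDescendants(int m) {
    assert(m >= 2);
    return (m + 1) / 2;
}

// Checks the descendant-count rules for a single page.
bool descendantCountIsValid(int m, int descendants, bool isRoot, bool isLeaf) {
    if (descendants > m) return false;    // every page: at most m descendants
    if (isLeaf) return true;              // assumption: leaves skip the lower bounds
    if (isRoot) return descendants >= 2;  // root: at least two descendants
    return descendants >= minDescendants(m);
}
```

For order 512, `minDescendants(512)` gives 256, which is the logarithm base that appears in the worst-case depth calculation.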
Worst-Case Search Depth
I We are interested in the worst-case depth of the tree, which occurs when every page of the tree has only the minimum number of descendants.
I In such a case the keys are spread over a maximal height for the tree and a minimal breadth.
I For a B-tree of order m, the minimum number of descendants from the root page is 2, so the second level of the tree contains only 2 pages.
I Each of these pages in turn has at least $\lceil m/2 \rceil$ descendants.
Worst-Case Search Depth (contd.)
Level       Minimum number of descendants
1 (root)    $2$
2           $2 \times \lceil m/2 \rceil$
3           $2 \times \lceil m/2 \rceil \times \lceil m/2 \rceil$
...         ...
d           $2 \times \lceil m/2 \rceil^{d-1}$
I In general, for any level d of a B-tree, the minimum number of descendants extending from that level is $2 \times \lceil m/2 \rceil^{d-1}$.
I For any tree with N keys in its leaves, we can express the relationship between the keys and the minimum height d as $N \geq 2 \times \lceil m/2 \rceil^{d-1}$.
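The inequality $N \geq 2 \times \lceil m/2 \rceil^{d-1}$ can be rearranged for d in a few steps. The sketch below assumes $\lceil m/2 \rceil > 1$, so that the logarithm is increasing and the direction of the inequality is preserved.

```latex
\begin{align*}
N &\geq 2 \times \lceil m/2 \rceil^{d-1} \\
\frac{N}{2} &\geq \lceil m/2 \rceil^{d-1} \\
\log_{\lceil m/2 \rceil}\!\left(\frac{N}{2}\right) &\geq d - 1 \\
d &\leq 1 + \log_{\lceil m/2 \rceil}\!\left(\frac{N}{2}\right)
\end{align*}
```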
Worst-Case Search Depth (contd.)
I Solving for d gives the upper bound for the depth of a B-tree with N keys:
  $d \leq 1 + \log_{\lceil m/2 \rceil}(N/2)$    (2)
I For example, with N = 1,000,000 keys and order m = 512: $d \leq 1 + \log_{256} 500000$, or $d \leq 3.37$.
I So we can say that given 1,000,000 keys, a B-tree of order 512 has a depth of no more than three (3) levels.
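The bound can be evaluated numerically. The following is a minimal sketch, not code from the slides: the names `depthUpperBound` and `maxLevels` are hypothetical, and the logarithm to base $\lceil m/2 \rceil$ is computed by the change-of-base rule.

```cpp
#include <cassert>
#include <cmath>

// Real-valued upper bound on depth: 1 + log_{ceil(m/2)}(N / 2).
// Function names are illustrative assumptions.
double depthUpperBound(double numKeys, int order) {
    assert(order >= 3 && numKeys >= 2.0);  // ceil(m/2) must be at least 2 to serve as a log base
    const double minFanout = std::ceil(order / 2.0);
    return 1.0 + std::log(numKeys / 2.0) / std::log(minFanout);
}

// Depth is a whole number of levels, so the deepest possible tree
// has floor(bound) levels.
int maxLevels(double numKeys, int order) {
    return static_cast<int>(std::floor(depthUpperBound(numKeys, order)));
}
```

With 1,000,000 keys and order 512, `depthUpperBound` evaluates to roughly 3.37 and `maxLevels` to 3, matching the figures above. Because the computation uses floating point, inputs that land exactly on a power of the base may round either way.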
Deletion, Merging and Redistribution
The rules for a B-tree of order m state the following:
I Every page except for the root and the leaves has at least $\lceil m/2 \rceil$ descendants.
I A page contains at least $\lceil m/2 \rceil$ keys and no more than m keys.
I The process of page splitting guarantees that these properties are maintained when new keys are inserted into the tree.
I We need to develop some kind of equally reliable guarantee that the properties are maintained when keys are deleted from the tree.
Example for Deletion
I In the simplest case, deleting a key causes no underflow in the node and does not change its largest value.
I Deletion of P from the second leaf node does not cause underflow, but it does change the largest key in the node.
I Hence, the second-level node must be modified to reflect the change.
I The key in the second-level node must be modified so that it contains O instead of P.
I Deletion of H causes an underflow in the third leaf node.
I After H is deleted, the last remaining key in the node, I, is inserted into the neighbor node, and the third leaf node is deleted.
Example for Deletion (contd.)
I The second leaf node has only three keys, so there is room for the key I in that node. This is a general merge operation.
I After the merge, the second-level node is modified to ...
Rules for deleting a key k from a node n:
1. If n has more than the minimum number of keys and k is not the largest in n, simply delete k from n.
2. If n has more than the minimum number of keys and k is the largest in n, delete k and modify the higher-level indexes to reflect the new largest key in n.
3. If n has exactly the minimum number of keys and one of the siblings of n has few enough keys, merge n with its sibling and delete a key from the parent node.
4. If n has exactly the minimum number of keys and one of the siblings of n has extra keys, redistribute by moving some keys from a sibling to n, and modify the higher-level indexes to reflect the new largest keys in the affected nodes.
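The four rules can be sketched as a decision function. Everything named here (`DeleteAction`, `classifyDeletion`, the parameters) is hypothetical. The sketch reads "few enough keys" as "the remaining keys of n and the sibling fit in one node", and it prefers a merge when both a merge and a redistribution would be possible; both choices are assumptions, not something the slides state.

```cpp
#include <cassert>

// Which of the four deletion rules applies. All names are hypothetical.
enum class DeleteAction {
    RemoveOnly,            // rule 1
    RemoveAndUpdateIndex,  // rule 2
    Merge,                 // rule 3
    Redistribute           // rule 4
};

// nodeKeys:    number of keys in n before the deletion
// siblingKeys: number of keys in the sibling being considered
// minKeys, maxKeys: per-node key limits (ceil(m/2) and m for order m)
// kIsLargest:  whether k is the largest key in n
DeleteAction classifyDeletion(int nodeKeys, int siblingKeys,
                              int minKeys, int maxKeys, bool kIsLargest) {
    assert(nodeKeys >= minKeys && nodeKeys <= maxKeys);
    if (nodeKeys > minKeys) {
        return kIsLargest ? DeleteAction::RemoveAndUpdateIndex
                          : DeleteAction::RemoveOnly;
    }
    // n would underflow: merge if the surviving keys fit in one node,
    // otherwise borrow keys from the sibling.
    const int keysAfterMerge = (nodeKeys - 1) + siblingKeys;
    return keysAfterMerge <= maxKeys ? DeleteAction::Merge
                                     : DeleteAction::Redistribute;
}
```

For example, with a minimum of 2 and a maximum of 4 keys per node, deleting from a node with 2 keys next to a sibling with 3 keys yields `Merge`, while a sibling with 4 keys yields `Redistribute`.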
Redistribution
I Redistribution differs from both splitting and merging in that it never causes the collection of nodes in the tree to change.
I The term sibling implies that the pages have the same parent.
I Redistribution tends to make a B-tree more efficient in its utilization of space.
I Space utilization can be quantified by viewing the amount of space used to store information as a percentage of the total amount of space required to hold the B-tree.
I Space utilization in a B-tree using two-way splitting is around 50% in the worst case.
I The idea is to use redistribution as an alternative to splitting when possible, splitting a page only when both of its siblings are full (see Bayer and McCreight's original paper, 1972).
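A minimal sketch of redistribution between two adjacent sibling leaves follows. The function name `redistribute`, the use of `char` keys, and the policy of evening out the two nodes are assumptions made for illustration. Note that the number of nodes is the same before and after the call; only their contents and the separating key change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Evens out the keys of two adjacent sibling leaves in place.
// `left` holds the smaller keys and `right` the larger ones; both are sorted.
// Returns the new largest key of `left`, which the parent index must then store.
char redistribute(std::vector<char>& left, std::vector<char>& right) {
    assert(!left.empty() || !right.empty());
    std::vector<char> all(left);
    all.insert(all.end(), right.begin(), right.end());
    // The left node receives the extra key when the total is odd.
    const auto leftCount = static_cast<std::ptrdiff_t>((all.size() + 1) / 2);
    left.assign(all.begin(), all.begin() + leftCount);
    right.assign(all.begin() + leftCount, all.end());
    return left.back();
}
```

Calling it on a full leaf holding A to E and a sibling holding only F leaves three keys in each node and returns C as the new separator, so no split is needed.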
I In work on B-trees in 1976, Knuth (1998) extends the notion of redistribution during insertion to include new rules for splitting; the resulting variation on the fundamental B-tree form is called a B* tree.
I Consider a system in which we are postponing splitting through redistribution.
I If we are considering any page other than the root, we know the following about it when it is finally split:
I Time to two-to-three split → page has at least one