
CONTENTS

• I. INTRODUCTION
• II. BINARY SEARCH TREE
• III. INSERT ALGORITHM
• IV. ANALYSIS OF EXAMPLE PROGRAM
• V. EXAMPLE PROGRAM
• VI. REFERENCES

I. INTRODUCTION
Hello; nice to meet you. Welcome to the “Binary Search Tree - C Structure
Tutorial - Part 6.”

II. BINARY SEARCH TREE

A tree is a nonlinear two-dimensional data structure whose nodes contain two
or more pointer links.

A binary tree is a tree which contains nodes with two pointer links of which
none, one, or both can be null.

A binary search tree is a binary tree whose nodes contain a left pointer link, a
right pointer link, and one or more data elements of which one data element is
a key field. Based on the key data element field, the left sub-tree key field data
element values are less than the parent node key field value and the right sub-
tree key field data element values are greater than or equal to the parent node
key field value. The binary search tree is also a recursive hierarchical data
structure and its shape depends on the order of insertion of the data elements.
The binary search tree can be used for many purposes including inserting,
searching, deleting, sorting and retrieving data.

1. Root
A binary search tree starts with a node called a root which is the topmost node
in the tree. A root pointer points to the root and left and right pointers
recursively point to smaller sub-trees to the bottom and on either side of the
root.

2. Parent, Children and Siblings

A parent is any node that has at least one child; the root is the topmost parent.

Children describe the nodes to the bottom left or right of a parent. If a node has
only one child, that child must be identified as either the left child or the right
child.
Siblings are children of the same parent.

A parent node can have zero, one or a max of two children nodes. As we follow
down the tree from a parent to a child the child becomes a parent and has
children nodes of its own. However the rule still holds, and the new parent node
can only have zero, one or a max of two child nodes.

3. Ancestor and Descendant

An ancestor of a given node is either the parent, the parent of the parent, the
parent of that parent, etc. For example, in section III below 18 is an ancestor of
33; however, 66 is not an ancestor of 33.

The counterpart of ancestor is descendant. For example, in section III below 33
is a descendant of 18; however, 84 is not a descendant of 18.

4. Balanced Binary Search Tree

The goal, when creating a binary search tree is to have a balanced tree, with
each parent node having the same number of children nodes to its bottom left
and to its bottom right. In other words, each root-to-leaf path should have
exactly the same number of nodes. Remember, you want a binary search tree
and not a linked list where each node only has one bottom branch to one child
node.

5. Internal and External Nodes

Internal nodes are nodes with two children.

External nodes have one or no children. As shown in section III below, an
external node with a terminating branch with no children is represented as a
NULL pointer from a parent node.

6. Leaf
A node that has neither a left child nor a right child is referred to as a leaf node.
In our example in section III below, the leaf nodes are 12, 45, 84 and 33.

III. INSERT ALGORITHM

Given the following randomly selected eight integer numbers: 44, 66, 18, 84,
22, 12, 33, and 45; the data element value in the root node key field would be
the first randomly selected number, i.e., 44.
...........................44
........................./....\

As shown above, the root is the first new node in the binary search tree. The
root has two downward branches or links. The left branch leads to a sub-tree
which only contains key field data element values less than the key field data
element value of the root node. The right sub-tree only contains key field data
element values greater than or equal to the key field data element value of the
root node.

For example, our next random number is the integer 66. When we compare 66
to 44 we see 66 is greater than 44. Therefore, as shown below, 66 will become
the data element value in a new node to the bottom right of the root node.
...........................44
........................./....\
................................66
............................../.....\

Selecting the next random number, which is the integer 18, and comparing 18
to the root, we see 18 is less than 44. Therefore, as shown below, 18 will
become the data element value in a new node to the bottom left of the root:
............................44
........................./.....\
......................18..........66
.................../.....\...../.....\

Selecting the next random number, which is the integer 84, and comparing 84
to the root 44, we see 84 is greater than 44. However, 66 is already in the node
to the bottom right of the root 44. Therefore, 66 becomes the new root. Comparing
84 to the new root 66, we see 84 is greater than 66. Therefore, as shown
below, 84 becomes the data element value in a new node to the bottom right of
the root 66.
............................44
........................./.....\
......................18..........66
.................../.....\...../.....\
........................................84
...................................../.....\

Selecting the next random number, which is the integer 22, and comparing 22
to the root 44, we see 22 is less than 44. However, 18 is already in the node to
the bottom left of the root 44. Therefore, 18 becomes the new root. Comparing
22 to the root 18, we see 22 is greater than 18. Therefore, as shown below, 22
becomes the data element value in a new node to the bottom right of the root
18.
............................44
........................./.....\
......................18..........66
.................../.....\...../.....\
.........................22...........84
......................./.....\....../.....\

Selecting the next random number, which is the integer 12, and comparing 12
to the root 44, we see 12 is less than 44. However, 18 is already in the node to
the bottom left of the root 44. Therefore, 18 becomes the new root. Comparing
12 to the root 18 we see 12 is less than 18. Therefore, as shown below, 12
becomes the data element value in a new node to the bottom left of the root
18.
............................44
........................./.....\
......................18..........66
.................../.....\......./.....\
.................12......22...........84
.............../.....\./.....\....../.....\

Selecting the next random number, which is the integer 33, and comparing 33
to the root 44, we see 33 is less than 44. However, 18 is already in the node to
the bottom left of the root 44. Therefore, 18 becomes the new root. Comparing
33 to the root 18, we see 33 is greater than 18. However, 22 is already in the
node to the bottom right of the root 18. Therefore, 22 becomes the new root.
Comparing 33 to the root 22, we see 33 is greater than 22. Therefore, as shown
below, 33 becomes the data element value in a new node to the bottom right of
the root 22.
.............................44...
........................./..........\...
....................18..................66...
................/........\............./........\
..........12..............22....................84...
......./.....\.........../....\................../...\.
.....null...null....null...33..............null..null.
.............................../..\..............
............................null.null..................

Selecting the last random number, which is the integer 45, and comparing 45 to
the root 44, we see 45 is greater than 44. However, 66 is already in the node to
the bottom right of the root 44. Therefore, 66 becomes the new root.
Comparing 45 to the root 66, we see 45 is less than 66. Therefore, as shown
below, 45 becomes the data element value in a new node to the bottom left of
the root 66.

Given our randomly selected eight numbers: 44, 66, 18, 84, 22, 12, 33, and
45; the completed binary search tree is as follows:
.............................44...
........................./..........\...
....................18..................66...
................/........\............./........\
..........12..............22........45..........84...
......./.....\.........../....\....../..\......../...\.
.....null...null....null...33..null..null..null..null.
.............................../..\..............
............................null.null..................

We end up with a binary search tree of size 8, depth 3, with root 44, and leaves
12, 45, 84 and 33. Notice the binary search tree is almost balanced; the root-
to-leaf path is two for all but leaf 33, which has a root-to-leaf path of three.

IV. ANALYSIS OF EXAMPLE PROGRAM

The example program contains detailed documentation and should be easy to
follow assuming you have a firm foundation in linked lists and pointers.

Instead of using a menu with a boring switch multiple-selection statement and
case labels, I used the following exciting array of three function pointers:

void( *ptr2Fcn[3] )( void ) = { exitt, unsorted, bst };

I think function pointers are more fun. The function pointers work as shown
below:

1. ptr2Fcn[ 0 ] points to exitt()

When the user types integer 0 at the menu prompt, and presses Enter, the
program uses the first function pointer to call the exitt() function and the
program exits.

2. ptr2Fcn[ 1 ] points to unsorted()

When the user types integer 1 at the menu prompt, and presses Enter, the
program uses the second function pointer to call the unsorted() function and the
program screen prints the unsorted array of preselected random integers.

3. ptr2Fcn[ 2 ] points to bst()

When the user types integer 2 at the menu prompt, and presses Enter, the
program uses the third function pointer to call the bst() function which inserts
the preselected random integers into the binary search tree; and screen prints
the resulting binary search tree ascending sort of the preselected random
integers.

If you have any questions, post them after the tutorial and I will post replies.

V. EXAMPLE PROGRAM
/*
PART 6
*/
#include<stdio.h> /* Standard I/O. */
#include<stdlib.h> /* Utility Functions. */

/*
DECLARATIONS
*/

/*
*/
{
};

" Enter your menu selection of integer 0, 1, or 2 then press enter: "

/*
Declaration of an int array of size 8,
initialized with eight random integer values.
*/
int A[ 8 ] = {44, 66, 18, 84, 22, 12, 33, 45};

/*
*/

{
    char alpha, buff [ 80 ];
    return fgets( buff, sizeof buff, stdin) && !isspace(*buff ) &&
    == '\0' );
}


/*
*/

/*
*/

/*
*/

/*
*/

/*
*/

{
    /*
    Call initialize int pointer array function.
    */
    initialize();

    /*
    Initialize array of three pointers to functions,
    each with no arguments, and each returns void.
    */

    do
    {

    {
        /*
        in the array ptr2Fcn.
        */
        ( *ptr2Fcn[ menuSelection ] )();

    do
    {
    }

    return 0;
}


/*
FUNCTION DEFINITIONS
*/

/*
Node insert function.
*/
{
    {
        return;
    }

    /*

    */
    if( element->value < ( *subTree )->value )
        insert( &( *subTree )->left, element );

    return;
}

/*
*/
{

    return;
}

/*
*/
{
    int i = 0;
    {
    };

    return;
}

{
    exit( 0 );

    return;
}


/*
*/
void unsorted( void )
{
    printf("\n\n ******************************************************************************\n\n");
    int u = 0;
    printf(" UNSORTED LINEAR ONE-DIMENSIONAL ARRAY OF RANDOM INTEGERS:\n\n ");
    printf("\n 0 1 2 3 4 5 6 7 <= Binary search tree insertion sequence.\n");
    printf("\n\n ******************************************************************************\n\n");

    return;
}


/*
*/
{


    int i = 0;
    {
        temp = ( node * )malloc( sizeof( node ));
        if( temp == NULL )
        {
            exit( 1 );
        }else
        {
            temp->left = NULL; /* Sets the left child of the child node to null. */
            temp->right = NULL; /* Sets the right child of the child node to null. */

            /*
            */

            /*
            */
            insert( &root, temp );
        }
    }

    printf("\n\n ******************************************************************************\n\n");

    /*
    */
    display( root );

    printf("\n\n ******************************************************************************\n\n");

    return;
}

When traversing a binary tree, or other trees for that matter, there are generally two ways to look up or
find items:
1) Depth-first traversal

For recursion-based searches, I prefer to use depth-first as it is more intuitive.

You still haven't said whether your tree structure is a height-balanced tree or not.
If it is, then the recursion is trivial: just count the left child nodes.

NOTE: I have not tested this. I've only tried to go through it in my head and
with a few scribbles on paper.

int height(node* curNode, int curHeight) {

    if(curNode != NULL) {
        if(curNode->left != NULL) {
            curHeight++;
            /* Keep the result of the recursive call; otherwise the count stops at 1. */
            curHeight = height(curNode->left, curHeight);
        }
    }

    return curHeight;
}

int main(void) {

    node* root;

    // called like this
    int h = 0;
    h = height(root, h);
}

On the other hand if the tree isn't balanced, you'll most likely need to search every branch for
the longest one.

// Declared first so the one-argument wrapper below can call it.
int height(node *curNode, int curHeight, int curMax);

int height(node *curNode) {

    int curHeight = 0;
    int curMax = 0;
    return height(curNode, curHeight, curMax);

}

int height(node *curNode, int curHeight, int curMax) {
    if(curNode != NULL) {
        if(curHeight > curMax) curMax = curHeight;

        // Traverse the left subtree
        if(curNode->left != NULL) {
            curMax = height(curNode->left, curHeight + 1, curMax);
        }

        // Traverse the right subtree
        if(curNode->right != NULL) {
            curMax = height(curNode->right, curHeight + 1, curMax);
        }
    }

    return curMax;
}

int main(void) {

    node *root;
    // Initialize tree

    int h = 0;
    h = height(root);
}

/**
 * Driver.cpp
 */

#include <iostream>
#include "Binary_Tree.h"
#include "Binary_Search_Tree.h"
#include "pre_order_traversal.h"
#include "post_order_traversal.h"
#include "in_order_traversal.h"
#include "Data.h"


int height(node *curNode, int curHeight, int curMax) {
    if(curNode != NULL) {
        if(curHeight > curMax) curMax = curHeight;

        // Traverse the left subtree
        if(curNode->left != NULL) {
            curMax = height(curNode->left, curHeight + 1, curMax);
        }

        // Traverse the right subtree
        if(curNode->right != NULL) {
            curMax = height(curNode->right, curHeight + 1, curMax);
        }
    }

    return curMax;
}


int main(){

    Binary_Search_Tree<Data> the_tree; // creates a binary search tree of type data named the_tree

    Data d1=Data(10, 'a');
    Data d2=Data(20, 'b');
    Data d3=Data(5, 'c');
    Data d4=Data(6, 'd');
    Data d5=Data(2, 'e');

    //insert the nodes
    the_tree.insert(d1);
    the_tree.insert(d2);
    the_tree.insert(d3);
    the_tree.insert(d4);
    the_tree.insert(d5);

    std::cout<<"****preorder:\n";
    pre_order_traversal(the_tree, std::cout, 0);
    std::cout<<"****inorder:\n";
    in_order_traversal(the_tree, std::cout, 0);
    std::cout<<"****post order:\n";
    post_order_traversal(the_tree, std::cout, 0);

    int h = height(the_tree, 0, 0);

    return 0;
}

struct node
{
    Data val;
    node *left;
    node *right;
};

/**
 * Driver.cpp
 *
 */

#include <iostream>
#include "Binary_Tree.h"
#include "Binary_Search_Tree.h"
#include "pre_order_traversal.h"
#include "post_order_traversal.h"
#include "in_order_traversal.h"
#include "Data.h"

template<typename Item_Type>
int height(Binary_Tree<Item_Type>& the_tree, int curHeight, int curMax){

    if(!the_tree.is_null()){
        if(curHeight > curMax){
            curMax = curHeight;
        }

        // Traverse the left subtree
        if(!the_tree.is_null()){
            curMax = height(the_tree.get_left_subtree(), curHeight +1, curMax);
        }

        // Traverse the right subtree
        if(!the_tree.is_null()){
            curMax = height(the_tree.get_right_subtree(), curHeight +1, curMax);
        }
    }
    return curMax;
}


int main(){

    // create the binary search tree of type data
    Binary_Search_Tree<Data> the_tree;

    Data d1=Data(10, 'a');
    Data d2=Data(20, 'b');
    Data d3=Data(5, 'c');
    Data d4=Data(6, 'd');
    Data d5=Data(2, 'e');

    // insert the nodes into the tree
    the_tree.insert(d1);
    the_tree.insert(d2);
    the_tree.insert(d3);
    the_tree.insert(d4);
    the_tree.insert(d5);

    int h = 0;
    h = height(the_tree, 0 , 0);

    return 0;
}

This is kind of a two-part question. The first part is how to calculate the height of a sub-tree. I know
the definition, "The height of a node is the length of the longest downward path to a leaf from that node,"
and I understand it, but I fail at implementing it. To confuse me further, this quote can be found on
Wikipedia on tree heights: "Conventionally, the value -1 corresponds to a subtree with no nodes, whereas
zero corresponds to a subtree with one node."

The second part is getting the balance factor of a sub-tree in an AVL tree. I've got no problem
understanding the concept: "get the height of your L and R sub-trees and subtract R from L". This is
defined as something like this: BALANCE = NODE[L][HEIGHT] - NODE[R][HEIGHT]

Wikipedia says this in the first few lines describing insertions into an AVL tree: "If the balance
factor becomes -1, 0, or 1 then the tree is still in AVL form, and no rotations are necessary."

It then goes on to say: "If the balance factor becomes 2 or -2 then the tree rooted at this node is
unbalanced, and a tree rotation is needed. At most a single or double rotation will be needed to balance
the tree." That I have no trouble grasping.

But (yes, there's always a but).

Here's where it gets confusing. The text states, "If the balance factor of R is 1, it means the insertion
occurred on the (external) right side of that node and a left rotation is needed". But from my understanding
the text said (as I quoted) that if the balance factor was within [-1, 1] then there was no need for
balancing?

I feel I'm so close to grasping the concept. I've gotten the tree rotations down, implemented a normal
binary search tree, and am on the brink of grasping AVL trees, but I just seem to be missing that essential
epiphany.

Edit: Code examples are preferred over academic formulas, as I've always had an easier time grasping
something in code, but any help is greatly appreciated.

Edit: I wish I could mark all answers as "answer", but for me Nick's answer was the first that made me
go "aha".

Part 1 - height

As starblue says, height is just recursive. In pseudo-code:

height(node) = max(height(node.L), height(node.R)) + 1

Now height could be defined in two ways. It could be the number of nodes in the
path from the root to that node, or it could be the number of links. According to
the page you referenced, the most common definition is for the number of links.
In which case the complete pseudo code would be:
height(node):
    if node == null:
        return -1
    else:
        return max(height(node.L), height(node.R)) + 1

If you wanted the number of nodes the code would be:

height(node):
    if node == null:
        return 0
    else:
        return max(height(node.L), height(node.R)) + 1

Either way, the rebalancing algorithm I think should work the same.
However, your tree will be much more efficient (O(ln(n))) if you store and update
height information in the tree, rather than calculating it each time. (O(n))

Part 2 - balancing
When it says "If the balance factor of R is 1", it is talking about the balance factor
of the right branch, when the balance factor at the top is 2. It is telling you how to
choose whether to do a single rotation or a double rotation. In (python like)
Pseudo-code:
if balance factor(top) = 2: // right is imbalanced
    if balance factor(R) = 1:
        do a left rotation
    else if balance factor(R) = -1:
        do a double rotation
else: // must be -2, left is imbalanced
    if balance factor(L) = -1:
        do a right rotation
    else if balance factor(L) = 1:
        do a double rotation

I hope this makes sense

• Height is easily implemented by recursion: take the maximum of the height of the
subtrees plus one.

• The "balance factor of R" refers to the right subtree of the tree which is out of
balance, I suppose.

Well, you can compute the height of a tree with the following recursive function:

int height(struct tree *t) {
    if (t == NULL)
        return 0;
    else
        return max(height(t->left), height(t->right)) + 1;
}

with an appropriate definition of max() and struct tree. You should take the time to
figure out why this corresponds to the definition based on path-length that you quote.
This function uses zero as the height of the empty tree.
However, for something like an AVL tree, I don't think you actually compute the height
each time you need it. Instead, each tree node is augmented with an extra field that
remembers the height of the subtree rooted at that node. This field has to be kept up-to-
date as the tree is modified by insertions and deletions.
I suspect that, if you compute the height each time instead of caching it within the tree
like suggested above, that the AVL tree shape will be correct, but it won't have the
expected logarithmic performance.

Here's where it gets confusing, the text states "If the balance factor of R is 1, it means
the insertion occurred on the (external) right side of that node and a left rotation is
needed". But from my understanding the text said (as I quoted) that if the balance factor
was within [-1, 1] then there was no need for balancing?

R is the right-hand child of the current node N.

If balance(N) = +2, then you need a rotation of some sort. But which rotation to use?
Well, it depends on balance(R): if balance(R) = +1 then you need a left-rotation on
N; but if balance(R) = -1 then you will need a double-rotation of some sort.

You do not need to calculate tree depths on the fly.

You can maintain them as you perform operations.
Furthermore, you don't in fact have to keep track of depths; you can simply
keep track of the difference between the left and right tree depths.
http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_avl.aspx
Just keeping track of the balance factor (difference between left and right subtrees) is, I
found, easier from a programming POV, except that sorting out the balance factor after a
rotation is a PITA...

Here's where it gets confusing, the text states "If the balance factor of R is 1, it means
the insertion occurred on the (external) right side of that node and a left rotation is
needed". But from my understanding the text said (as I quoted) that if the balance factor
was within [-1, 1] then there was no need for balancing?

Okay, epiphany time.
Consider what a rotation does. Let's think about a left rotation.
P = parent
O = ourself (the element we're rotating)
RC = right child
LC = left child (of the right child, not of ourself)

 P
  \
   O
    \
     RC
    /
   LC

 P
  \
   RC
  /
 O
  \
   LC

 10
   \
    15
      \
       20
      /
    18

 10
   \
    20
   /
 15
   \
    18

1. our right child moves into our position

2. we become the left child of our right child
3. our right child's left child becomes our right

Now, the big thing you have to notice here - this left rotation HAS NOT CHANGED THE
DEPTH OF THE TREE. We're no more balanced for having done it.
But - and here's the magic in AVL - if we rotated the right child to the right FIRST, what
we'd have is this...
 P
  \
   O
    \
     LC
      \
       RC

And NOW if we rotate O left, what we get is this...

 P
  \
   LC
  /  \
 O    RC

Magic! we've managed to get rid of a level of the tree - we've made the tree balance.
Balancing the tree means getting rid of excess depth, and packing the upper levels more
completely - which is exactly what we've just done.
That whole stuff about single/double rotations is simply that you have to have your
subtree looking like this;
 P
  \
   O
    \
     LC
      \
       RC

before you rotate - and you may have to do a right rotate to get into that state. But if
you're already in that state, you only need to do the left rotate.

height:

class Node
{
    data value; //data is a custom data type
    Node right;
    Node left;
    int height;
}

Now, we'll do a simple breadth-first traversal of the tree, and keep updating the height
value for each node:
int height (Node root)
{
    Queue<Node> q = new Queue<Node>();
    Node lastnode = root;
    //reset height
    root.height = 0;

    q.Enqueue(root);
    while(q.Count > 0)
    {
        lastnode = q.Dequeue();
        if (lastnode.left != null){
            lastnode.left.height = lastnode.height + 1;
            q.Enqueue(lastnode.left);
        }

        if (lastnode.right != null){
            lastnode.right.height = lastnode.height + 1;
            q.Enqueue(lastnode.right);
        }
    }
    return lastnode.height; //this will return a 0-based height, so just a root has a height of 0
}