TREES

CONTENTS
❖ Terminology
❖ Weighted Trees
❖ Spanning Trees and Minimum Spanning Trees
❖ Prim’s and Kruskal’s Algorithm
❖ Isomorphism of Trees and Subtrees
❖ Prefix Codes
TERMINOLOGY
WEIGHTED TREE
A weighted tree is a tree whose nodes and/or edges are assigned labels (usually numbers).
SPANNING TREES
A spanning tree of a graph G is a subgraph that covers all the vertices of G with the minimum possible number of edges.
Hence, a spanning tree does not have cycles and it cannot be disconnected.
By this definition, we can conclude that every connected, undirected graph G has at least one spanning tree. A disconnected graph does not have any spanning tree, because no tree can span all of its vertices.
In the example, we found three spanning trees in one complete graph.
A complete undirected graph has exactly n^(n-2) spanning trees (Cayley's formula), where n is the number of nodes; any other graph on n nodes has fewer.
In the given example, n is 3, hence 3^(3-2) = 3 spanning trees are possible.
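The n^(n-2) count can be checked by brute force for very small n. The following is a minimal sketch, not part of the original slides; the function names are made up for illustration, and it assumes that enumerating every set of n - 1 edges is affordable.

```python
from itertools import combinations

def is_spanning_tree(n, edge_subset):
    """Return True if n - 1 given edges on vertices 0..n-1 contain no cycle."""
    parent = list(range(n))  # union-find forest, one set per vertex

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_subset:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:  # u and v already connected: this edge closes a cycle
            return False
        parent[root_u] = root_v
    return True  # n - 1 acyclic edges on n vertices form a spanning tree

def count_spanning_trees_complete(n):
    """Count spanning trees of the complete graph on n vertices by enumeration."""
    edges = list(combinations(range(n), 2))
    return sum(is_spanning_tree(n, subset) for subset in combinations(edges, n - 1))
```

For n = 3 this returns 3, matching the example, and for n = 4 it returns 16 = 4^2.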
GENERAL PROPERTIES OF SPANNING TREE
A connected graph G can have more than one spanning tree.

All possible spanning trees of graph G have the same number of edges and vertices.

The spanning tree does not have any cycle (loops).

Removing one edge from the spanning tree will make the graph
disconnected, i.e. the spanning tree is minimally connected.

Adding one edge to the spanning tree will create a circuit or loop, i.e. the
spanning tree is maximally acyclic.
MATHEMATICAL PROPERTIES OF SPANNING TREE
A spanning tree has n - 1 edges, where n is the number of nodes (vertices).

From a complete graph (or any connected graph) with e edges and n nodes, we can construct a spanning tree by removing e - n + 1 suitably chosen edges.
APPLICATION OF SPANNING TREE
A spanning tree is basically used to find a minimum set of links that connects all nodes in a graph. Common applications of spanning trees are:

• Civil Network Planning
• Computer Network Routing Protocol

Let us understand this through a small example. Consider a city network as a huge graph, and suppose we plan to deploy telephone lines in such a way that the minimum number of lines connects all city nodes. This is where the spanning tree comes into the picture.
MINIMUM SPANNING TREE

A minimum spanning tree (MST), or minimum weight spanning tree, is a spanning tree with the minimum possible total edge weight.
KRUSKAL’S ALGORITHM
• Kruskal’s Algorithm is a famous greedy algorithm.
• It is used for finding the Minimum Spanning Tree (MST) of a given graph.
• To apply Kruskal’s algorithm, the given graph must be weighted, connected and
undirected.

Consider a graph G(V1, E1), where V1 is the set of all vertices and E1 is the set of all edges.
Let T(V2, E2) be the tree being built.
Initially V2 = V1 and E2 is the empty set.
KRUSKAL’S ALGORITHM
Step-1:
Sort all the edges from low weight to high weight.
Step-2:
Take the edge with the lowest weight and use it to connect the vertices of the graph.
If adding an edge creates a cycle, then reject that edge and go on to the next least-weight edge.
Step-3:
Keep adding edges until all the vertices are connected and a Minimum Spanning Tree (MST) is obtained.
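The three steps above can be sketched in Python with a union-find structure for the cycle test. This is an illustrative sketch, not the slides' own code. The edge list EDGES is an assumption: the original figure is not available, so the endpoints were chosen to be consistent with the weights 10, 12, ..., 28 and the vertices 1 to 7 used in the trace.

```python
def kruskal(vertices, edges):
    """Return the MST edges as (weight, u, v), in the order Kruskal accepts them."""
    parent = {v: v for v in vertices}  # union-find: each vertex starts alone

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):        # Step-1: lowest weight first
        root_u, root_v = find(u), find(v)
        if root_u == root_v:                  # Step-2: edge would create a cycle
            continue
        parent[root_u] = root_v
        tree.append((weight, u, v))
        if len(tree) == len(parent) - 1:      # Step-3: n - 1 edges, tree complete
            break
    return tree

# Assumed edge list (weight, u, v); endpoints are hypothetical.
EDGES = [(10, 1, 6), (12, 3, 4), (14, 2, 7), (16, 2, 3), (18, 4, 7),
         (22, 4, 5), (24, 5, 7), (25, 5, 6), (28, 1, 2)]

mst = kruskal(range(1, 8), EDGES)
mst_weights = [w for w, _, _ in mst]
```

On this assumed graph the accepted weights are 10, 12, 14, 16, 22, 25, with 18 and 24 rejected because they close a cycle.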
KRUSKAL’S ALGORITHM
Edges sorted according to weights:
10, 12, 14, 16, 18, 22, 24, 25, 28

Sr. No.  Edge considered  Tree (V, E)                                     Remark
1        (none)           ({1,2,3,4,5,6,7}, { })                          Initially there is no edge in the tree.
2        10               ({1,2,3,4,5,6,7}, {10})                         Added.
3        12               ({1,2,3,4,5,6,7}, {10,12})                      Added.
4        14               ({1,2,3,4,5,6,7}, {10,12,14})                   Added.
5        16               ({1,2,3,4,5,6,7}, {10,12,14,16})                Added.
6        18               ({1,2,3,4,5,6,7}, {10,12,14,16})                It creates a cycle in the tree, so NOT added.
7        22               ({1,2,3,4,5,6,7}, {10,12,14,16,22})             Added.
8        24               ({1,2,3,4,5,6,7}, {10,12,14,16,22})             It creates a cycle in the tree, so NOT added.
9        25               ({1,2,3,4,5,6,7}, {10,12,14,16,22,25})          Added.

The tree now has 6 edges on 7 vertices, so it is a spanning tree and edge 28 need not be considered.
PRIM’S ALGORITHM
• Prim’s Algorithm is a famous greedy algorithm.
• It is used for finding the Minimum Spanning Tree (MST) of a given graph.
• To apply Prim’s algorithm, the given graph must be weighted, connected and undirected.

Consider a graph G(V1, E1), where V1 is the set of all vertices and E1 is the set of all edges.
Let T(V2, E2) be the tree being built.
Initially V2 and E2 are empty sets.
PRIM’S ALGORITHM
Step-1: Choose any vertex v0 at random and add it to the tree: T({v0}, { }).
(A vertex incident to the least-weight edge is usually selected.)
Step-2:
2.1 Find all the edges that connect the tree to new vertices.
2.2 Find the least-weight edge among those edges and include it in the existing tree.
2.3 If including that edge creates a cycle, then reject that edge and look for the next least-weight edge.
Step-3:
Keep repeating Step-2 until all the vertices are included and the Minimum Spanning Tree (MST) is obtained.
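A minimal Python sketch of these steps, using a binary heap to pick the least-weight edge leaving the tree. It is an illustration rather than the slides' own implementation. As before, the edge list EDGES is an assumption chosen to be consistent with the weights and vertex numbers in the trace, since the original figure is not available.

```python
import heapq
from collections import defaultdict

def prim(edges, start):
    """Return the weights of the MST edges in the order Prim's algorithm adds them."""
    adjacency = defaultdict(list)
    for weight, u, v in edges:
        adjacency[u].append((weight, u, v))
        adjacency[v].append((weight, v, u))

    in_tree = {start}                       # Step-1: start from one vertex
    heap = list(adjacency[start])           # candidate edges leaving the tree
    heapq.heapify(heap)
    chosen = []
    while heap and len(in_tree) < len(adjacency):
        weight, _, target = heapq.heappop(heap)   # Step-2.2: least-weight candidate
        if target in in_tree:                     # Step-2.3: would create a cycle
            continue
        in_tree.add(target)
        chosen.append(weight)
        for edge in adjacency[target]:            # Step-2.1: new candidate edges
            if edge[2] not in in_tree:
                heapq.heappush(heap, edge)
    return chosen

# Assumed edge list (weight, u, v); endpoints are hypothetical.
EDGES = [(10, 1, 6), (12, 3, 4), (14, 2, 7), (16, 2, 3), (18, 4, 7),
         (22, 4, 5), (24, 5, 7), (25, 5, 6), (28, 1, 2)]

prim_order = prim(EDGES, start=6)
```

Starting from vertex 6 on this assumed graph, the edges are added in the order 10, 25, 22, 12, 16, 14. The total weight, 99, is the same as for Kruskal's algorithm.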
PRIM’S ALGORITHM
Sr. No.  Edges to be considered  Edge with minimum weight       Tree (V, E)
                                 that is not creating a cycle
1        (none)                  (none)                         ({6}, { })   Select any vertex, add it to the tree.
2        10, 25                  10                             ({6,1}, {10})
3        25, 28                  25                             ({6,1,5}, {10,25})
4        28, 22, 24              22                             ({6,1,5,4}, {10,25,22})
5        28, 24, 12, 18          12                             ({6,1,5,4,3}, {10,25,22,12})
6        28, 24, 18, 16          16                             ({6,1,5,4,3,2}, {10,25,22,12,16})
7        28, 24, 18, 14          14                             ({6,1,5,4,3,2,7}, {10,25,22,12,16,14})
PRIM’S ALGORITHM
Derive minimum spanning tree of given graph using Prim’s
algorithm.
ISOMORPHISM OF TREES
Two binary trees are called isomorphic if one can be obtained from the other by a series of flips, i.e. by swapping the left and right children of some nodes.
Given two binary trees, check whether you can obtain one tree from the other by swapping the left and right children of several nodes.
Any number of nodes at any level can have their children swapped.
For example, the two trees in the figure are isomorphic with the following sub-trees flipped: 2 and 3, NULL and 6, 7 and 8.
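The check described above can be sketched recursively: two subtrees match if their roots carry the same key and their children match either as they are or after a flip. The Node class and the two example trees are assumptions made for this sketch. The figure is not available, so the trees were built to be consistent with the flips named in the text.

```python
class Node:
    """Binary tree node; a hypothetical helper for this sketch."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def is_flip_isomorphic(a, b):
    """True if tree b can be obtained from tree a by swapping children of some nodes."""
    if a is None and b is None:
        return True
    if a is None or b is None or a.key != b.key:
        return False
    unflipped = (is_flip_isomorphic(a.left, b.left)
                 and is_flip_isomorphic(a.right, b.right))
    flipped = (is_flip_isomorphic(a.left, b.right)
               and is_flip_isomorphic(a.right, b.left))
    return unflipped or flipped

# Assumed example trees: flips at node 1 (2 and 3), node 3 (NULL and 6), node 5 (7 and 8).
tree1 = Node(1, Node(2, Node(4), Node(5, Node(7), Node(8))), Node(3, Node(6)))
tree2 = Node(1, Node(3, None, Node(6)), Node(2, Node(4), Node(5, Node(8), Node(7))))
tree3 = Node(1, Node(2), Node(3, Node(4)))  # different shape, not isomorphic to tree1
```

Each node can be tested in both orientations, so the worst-case running time of this simple version is not linear; it is meant only to mirror the definition.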
ISOMORPHISM OF TREES
Algorithm by Aho, Hopcroft and Ullman (AHU algorithm)

The AHU algorithm associates with each vertex a Knuth tuple that describes the complete history of its descendants.

To convert parenthetical tuples to canonical names:
• Drop all “0”-s.
• Replace “(” and “)” with “1” and “0” respectively.
HUFFMAN CODES
• Used to encode data efficiently.
• It is a widely used and beneficial technique for compressing data.
• Huffman's greedy algorithm uses a table of the frequencies of occurrence of each character to build up an optimal way of representing each character as a binary string.
• Suppose we have 10^5 characters in a data file. Normal storage: 8 bits per character (ASCII), i.e. 8 x 10^5 bits in a file.
HUFFMAN CODES
• Suppose only six characters appear in the file:

If we use a fixed-length code, each letter is represented by an equal number of bits. With a fixed-length code, at least 3 bits per character are needed:

a 000
b 001
c 010
d 011
e 100
f 101

For a file with 10^5 characters, we need 3 x 10^5 bits.
HUFFMAN CODES
• Suppose only six characters appear in the file:

If we use a variable-length code, it can do considerably better than a fixed-length code, by giving frequent characters short codewords and infrequent characters long codewords.

a 0
b 101
c 100
d 111
e 1101
f 1100

For a file of 10^5 characters:
Number of bits = (45 x 1 + 13 x 3 + 12 x 3 + 16 x 3 + 9 x 4 + 5 x 4) x 1000
= 2.24 x 10^5 bits

How much memory is saved?
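The saving can be worked out directly. The sketch below assumes the frequencies (in thousands) that appear in the calculation above: a 45, b 13, c 12, d 16, e 9, f 5. The variable names are illustrative.

```python
# Frequencies in thousands of occurrences, read off the calculation in the slide.
freq_thousands = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
variable_code = {"a": "0", "b": "101", "c": "100",
                 "d": "111", "e": "1101", "f": "1100"}

total_chars = 1000 * sum(freq_thousands.values())   # 100,000 characters
fixed_bits = 3 * total_chars                        # 3 bits per character
variable_bits = 1000 * sum(freq_thousands[ch] * len(variable_code[ch])
                           for ch in freq_thousands)
saved_bits = fixed_bits - variable_bits
saved_fraction = saved_bits / fixed_bits
```

The variable-length code needs 224,000 bits against 300,000 bits for the fixed-length code, a saving of 76,000 bits, or roughly 25%.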
HUFFMAN CODES
• Huffman Coding is a famous Greedy Algorithm.
• It is used for the lossless compression of data.
• It uses variable-length encoding.
• It assigns a variable-length code to each character.
• The code length of a character depends on how frequently it occurs in the given text.
• The character which occurs most frequently gets the shortest code.
• The character which occurs least frequently gets the longest code.
• It is also known as Huffman Encoding.
HUFFMAN CODES
Prefix Rule-

• Huffman Coding implements a rule known as the prefix rule.
• This is to prevent ambiguities while decoding.
• It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.
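To make the prefix rule concrete, here is a small sketch that tests whether a set of codewords is prefix-free and decodes a bit string greedily. It uses the six-character variable-length code shown earlier. The function names are made up for this example.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of another (duplicates count as a violation)."""
    words = sorted(codewords)
    # After sorting, any word that extends `cur` appears directly after it.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))

def decode(bits, code_table):
    """Decode a bit string by emitting a character as soon as a codeword matches."""
    inverse = {code: ch for ch, code in code_table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(decoded)

CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
```

Because the code is prefix-free, the greedy decoder never has to backtrack: decode("0101100", CODE) yields "abc". A code such as {"0", "01"} breaks the rule, since "0" is a prefix of "01".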
HUFFMAN CODES
Major Steps in Huffman Coding-

There are two major steps in Huffman Coding-
1. Building a Huffman Tree from the input characters.
2. Assigning codes to the characters by traversing the Huffman Tree.
HUFFMAN CODES
Building a Huffman Tree from the input characters
Step-1: Create a leaf node for each character of the text.
The leaf node of a character contains the frequency of occurrence of that character.
Step-2: Arrange all the nodes in increasing order of their frequency value.
Step-3: Take the first two nodes, which have the minimum frequencies, and create a new internal node.
The frequency of this new node is the sum of the frequencies of those two nodes.
Make the first node the left child and the other node the right child of the newly created node.
Step-4: Keep repeating Step-2 and Step-3 until all the nodes form a single tree.
The tree finally obtained is the desired Huffman Tree.
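The steps above can be sketched with a priority queue instead of re-sorting the node list on every round. This is an assumption of the sketch, not something the slides prescribe. The function name and the tie-breaking counter are also illustrative. Ties may be broken differently than in a hand-drawn tree, so individual codewords can differ while the code lengths stay optimal.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman tree from {char: frequency} and return {char: codeword}."""
    tiebreak = count()  # unique counter so the heap never compares tree nodes
    heap = [(freq, next(tiebreak), ch) for ch, freq in freqs.items()]
    heapq.heapify(heap)                               # Step-1 and Step-2
    while len(heap) > 1:                              # Step-4
        f1, _, left = heapq.heappop(heap)             # Step-3: two smallest nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):                   # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                         # leaf: a character
            codes[node] = prefix or "0"               # single-character edge case
    walk(heap[0][2], "")
    return codes

FREQS = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
codes = huffman_codes(FREQS)
lengths = {ch: len(code) for ch, code in codes.items()}
```

For the frequencies a 10, e 15, i 12, o 3, u 4, s 13, t 1 the resulting code lengths are 3, 2, 2, 5, 4, 2, 5, for a total of 146 bits.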
HUFFMAN CODES
A file contains the following characters with the frequencies as shown. If Huffman Coding is used for data compression, determine-
1. Huffman Code for each character
2. Average code length
3. Length of Huffman encoded message (in bits)

Characters  Frequencies
a           10
e           15
i           12
o           3
u           4
s           13
t           1
HUFFMAN CODES
The second major step is to assign weights to all the edges of the constructed Huffman Tree.

Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.

Rules
1. If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
2. If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
3. Any of the above two conventions may be followed.
4. But follow the same convention at the time of decoding that is adopted at the time of
encoding.
HUFFMAN CODES

After assigning weights to all the edges, the modified Huffman Tree is-
[Figure: Huffman Tree with edge weights 0 and 1]
HUFFMAN CODES
1. Huffman Code For Characters-

To write the Huffman Code for any character, traverse the Huffman Tree from the root node to the leaf node of that character.

Following this rule, the Huffman Code for each character is-

a = 111 e = 10 i = 00 o = 11001
u = 1101 s = 01 t = 11000

From here, we can observe-

Characters occurring less frequently in the text are assigned the longer codes.
Characters occurring more frequently in the text are assigned the shorter codes.
HUFFMAN CODES
Given

Characters  Frequencies
a           10
e           15
i           12
o           3
u           4
s           13
t           1

Obtained from Huffman tree

a = 111 e = 10 i = 00 o = 11001
u = 1101 s = 01 t = 11000

2. Average Code Length-

Using formula-01, we have-

Average code length
= ∑ ( frequency_i x code length_i ) / ∑ ( frequency_i )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 146 / 58
≈ 2.52
HUFFMAN CODES
Given the same frequencies and the codes obtained from the Huffman tree

a = 111 e = 10 i = 00 o = 11001
u = 1101 s = 01 t = 11000

3. Length of Huffman Encoded Message-

Using formula-02, we have-

Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x (146 / 58)
= 146 bits

(Using the rounded average 2.52 gives 58 x 2.52 = 146.16, which overstates the exact total of 146 bits.)
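The two formulas can be checked with a few lines of Python. The sketch uses the frequencies and codes from the worked example. The variable names are illustrative only.

```python
FREQS = {"a": 10, "e": 15, "i": 12, "o": 3, "u": 4, "s": 13, "t": 1}
CODES = {"a": "111", "e": "10", "i": "00", "o": "11001",
         "u": "1101", "s": "01", "t": "11000"}

total_chars = sum(FREQS.values())                              # sum of frequency_i
total_bits = sum(FREQS[ch] * len(CODES[ch]) for ch in FREQS)   # sum of frequency_i x code length_i
average_length = total_bits / total_chars                      # formula-01
```

Here total_chars is 58 and total_bits is 146, so the average code length is 146 / 58, about 2.52 bits per character. Multiplying the unrounded average by 58 gives back exactly 146 bits.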
HUFFMAN CODES

Example: Find an optimal Huffman Code for the following set of frequencies:
a: 50 b: 25 c: 15 d: 40 e: 75
Tree isomorphism

Alexander Smal

St.Petersburg State University of Information Technologies, Mechanics and Optics

Joint Advanced Student School 2008

Motivation

In some applications the chemical structures are trees with millions of vertices:
∙ gene splicing,
∙ protein analysis,
∙ molecular biology.
The difference between O(n), O(n log n), and O(n²) isomorphism algorithms is not just of theoretical importance.

Graph isomorphism
Definition
An isomorphism of graphs G1(V1, E1) and G2(V2, E2) is a bijection between the vertex sets 𝜙 : V1 → V2 such that

∀u, v ∈ V1: (u, v) ∈ E1 ⇔ (𝜙(u), 𝜙(v)) ∈ E2.

Facts
∙ No polynomial-time algorithm is known for testing whether two arbitrary graphs are isomorphic.
∙ It is still an open question (!) whether graph isomorphism is NP-complete.
∙ Polynomial-time isomorphism algorithms are known for various graph subclasses, such as trees.

Rooted trees
Definition
A rooted tree (V, E, r) is a tree (V, E) with a selected root r ∈ V.
Definition
An isomorphism of rooted trees T1(V1, E1, r1) and T2(V2, E2, r2) is a bijection between the vertex sets 𝜙 : V1 → V2 such that

∀u, v ∈ V1: (u, v) ∈ E1 ⇔ (𝜙(u), 𝜙(v)) ∈ E2

and 𝜙(r1) = r2.
Example
T1 and T2 are isomorphic as graphs but not as rooted trees!
[Figure: T1 is the path a, b, c rooted at a; T2 has root B with the two children A and C.]

Rooted trees (part 2)
Lemma
If there is an O(n) algorithm for rooted tree isomorphism, then there is an O(n) algorithm for ordinary tree isomorphism.

Proof.
1 Let A be an O(n) algorithm for rooted trees.
2 Let T1 and T2 be ordinary trees.
3 Let’s find the centers of these trees. There are three cases:
  1 each tree has only one center (c1 and c2 respectively)
    return A(T1, c1, T2, c2)
  2 each tree has exactly two centers (c1, c1′ and c2, c2′ respectively)
    return A(T1, c1, T2, c2) or A(T1, c1′, T2, c2)
  3 the trees have different numbers of centers
    return False

Diameter and center

Definition
The diameter of a tree is the length of its longest path.

Definition
A center is a vertex v such that the longest path from v to a leaf is minimal over all vertices in the tree.

Algorithm
1: Choose a random root r.
2: Find a vertex v1 farthest from r.
3: Find a vertex v2 farthest from v1.
4: The diameter is the length of the path from v1 to v2.
5: The centers are the median elements of the path from v1 to v2.

It is an O(n) algorithm.

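The five steps above can be sketched with two breadth-first searches. The adjacency-dictionary representation and the function names are assumptions of this sketch, not part of the slides.

```python
from collections import deque

def farthest_vertex(adjacency, start):
    """BFS from start; return a farthest vertex and the BFS parent map."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()          # the last vertex dequeued is a farthest one
        for neighbour in adjacency[last]:
            if neighbour not in parent:
                parent[neighbour] = last
                queue.append(neighbour)
    return last, parent

def diameter_and_centers(adjacency):
    """Return (diameter in edges, set of one or two center vertices) of a tree."""
    root = next(iter(adjacency))                    # step 1: any vertex as root
    v1, _ = farthest_vertex(adjacency, root)        # step 2
    v2, parent = farthest_vertex(adjacency, v1)     # step 3
    path = [v2]                                     # walk back from v2 to v1
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    length = len(path) - 1                          # step 4
    centers = {path[length // 2], path[(length + 1) // 2]}  # step 5
    return length, centers

# Hypothetical test trees: a path on four vertices and a star with three leaves.
PATH = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
STAR = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

A path on four vertices has diameter 3 and the two centers 2 and 3, while the star has diameter 2 and the single center 0. Each BFS visits every vertex and edge once, which gives the O(n) bound.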
The idea
Let’s try to find a complete invariant of rooted tree isomorphism.
Definition
An isomorphism invariant is a function f(T) such that f(T1) = f(T2) for all pairs of isomorphic trees T1 and T2.

Definition
A complete isomorphism invariant is a function f(T) such that two trees T1 and T2 are isomorphic if and only if f(T1) = f(T2).

So if we find a complete isomorphism invariant, we can obtain an algorithm from it.

Note
Starting from the next slide, tree always means rooted tree!

Candidate 1
Observation
The level number of a vertex is a tree isomorphism invariant.

Conjecture
Two trees are isomorphic if and only if they have the same number of levels and the same number of vertices on each level.

Observation
The number of leaves is a tree isomorphism invariant.

Contrary instance
[Figure: two non-isomorphic trees T1 and T2 with the same number of levels and the same number of vertices on each level.]

Candidate 2

What’s wrong with candidate 1?

We didn’t take into account the degree spectrum of a tree.

Definition
The degree spectrum of a tree is the sequence of non-negative integers {dj}, where dj is the number of vertices that have j children.

Conjecture
Two trees are isomorphic if and only if they have the same degree spectrum.

Candidate 2 (part 2)
Observation
Since a tree isomorphism preserves longest paths from the root, the number of levels in a tree is a tree isomorphism invariant.
Contrary instance
[Figure: two non-isomorphic trees T1 and T2 with the same degree spectrum; each has vertices a to e (A to E) and a chain of vertices 1, ..., n attached at a different place.]

Candidate 3

Conjecture
Two trees are isomorphic if and only if they have the same degree spectrum at each level.

If two trees have the same degree spectrum at each level, then they must automatically have the same number of levels, the same number of vertices at each level, and the same global degree spectrum!

Observation
The number of leaf descendants of a vertex and the level number of a vertex are both tree isomorphism invariants.

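Candidate 3 (part 2)

Contrary instance
[Figure: two non-isomorphic trees T1 (vertices a to g plus a chain 1, ..., n) and T2 (vertices A to G plus a chain 1, ..., n) with the same degree spectrum at each level.]

Level                      Degree spectrum
0 (a / A)                  (0, 0, 1, 0, ...)
1 (b, c / B, C)            (0, 1, 1, 0, ...)
2 (d, e, f / D, E, F)      (2, 0, 1, 0, ...)
3 (g, 1 / G, 1)            (1, 1, 0, 0, ...)
...                        ...
last (n)                   (1, 0, 0, 0, ...)
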
AHU algorithm
Algorithm by Aho, Hopcroft and Ullman
∙ Determines tree isomorphism in time O(|V|).
∙ Uses the complete history of the degree spectrum of the vertex descendants as a complete invariant.

The idea of AHU algorithm
The AHU algorithm associates with each vertex a tuple that describes the complete history of its descendants.

Hard question
Why are our previous invariants not complete?

Let’s discuss the AHU algorithm. We start from the O(|V|²) version and then I will explain how to make it faster (O(|V|)).
Understanding AHU algorithm

Knuth tuples
Let’s assign parenthetical tuples to all tree vertices.

Knuth tuples example
(A is the root with children B, C and D; B has the child E; C has the children F and G.)

A ( ((0)) ((0)(0)) (0) )

B ((0))      C ((0)(0))      D (0)

E (0)        F (0)  G (0)

Understanding AHU algorithm (part 2)

There is an algorithm Assign-Knuth-Tuples that visits every vertex once or twice.

Assign-Knuth-Tuples(v)
if v is a leaf then
    Give v the tuple name (0)
else
    for all children w of v do
        Assign-Knuth-Tuples(w)
    end for
    Concatenate the names of all children of v to temp
    Give v the tuple name (temp)
end if

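A direct Python rendering of Assign-Knuth-Tuples might look as follows. It is a sketch under the assumption that the rooted tree is given as a dictionary mapping each vertex to the list of its children. The function name and the example tree (matching the Knuth tuples example, with root A) are illustrative.

```python
def knuth_tuple(children, v):
    """Return the parenthetical (Knuth) tuple of vertex v as a string."""
    kids = children.get(v, [])
    if not kids:
        return "(0)"                      # a leaf gets the tuple name (0)
    # An inner vertex wraps the concatenated tuples of its children.
    return "(" + "".join(knuth_tuple(children, w) for w in kids) + ")"

# Tree from the example: A has children B, C, D; B has E; C has F and G.
EXAMPLE = {"A": ["B", "C", "D"], "B": ["E"], "C": ["F", "G"]}
root_tuple = knuth_tuple(EXAMPLE, "A")
```

For the example tree the root receives (((0))((0)(0))(0)). The children are taken in the order stored, so two isomorphic trees can still get different tuples, which is the problem addressed on the next slides.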
Understanding AHU algorithm (part 3)
Observation
There is no order on parenthetical tuples.

Example

A ( (0) ((0)) )            a ( ((0)) (0) )

B (0)    C ((0))           b ((0))    c (0)

         D (0)             d (0)

Let’s convert parenthetical tuples to canonical names. We should drop all “0”-s and replace “(” and “)” with “1” and “0” respectively.

A 1 10 1100 0              a 1 1100 10 0

B 10     C 1100            b 1100     c 10

         D 10              d 10

After sorting the names of the children, both roots get the same canonical name:

A 1 10 1100 0              a 1 10 1100 0
Understanding AHU algorithm (part 4)

Assign-Canonical-Names(v)
if v is a leaf then
    Give v the name “10”
else
    for all children w of v do
        Assign-Canonical-Names(w)
    end for
    Sort the names of the children of v
    Concatenate the names of all children of v to temp
    Give v the name 1temp0
end if

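Assign-Canonical-Names translates almost line by line into Python. The sketch below assumes the same children-dictionary representation of a rooted tree. The function names and the small test trees are made up for illustration. This is the simple quadratic version, not the linear-time refinement.

```python
def canonical_name(children, v):
    """Return the canonical name of vertex v: '1' + sorted child names + '0'."""
    names = sorted(canonical_name(children, w) for w in children.get(v, []))
    return "1" + "".join(names) + "0"     # a leaf has no children and gets "10"

def rooted_isomorphic(children1, root1, children2, root2):
    """Two rooted trees are isomorphic iff their roots have equal canonical names."""
    return canonical_name(children1, root1) == canonical_name(children2, root2)

# Trees from the canonical-name example: the same shape with children listed in a different order.
T1 = {"A": ["B", "C"], "C": ["D"]}
T2 = {"a": ["b", "c"], "b": ["d"]}
# A path rooted at its end versus the same path rooted at its middle.
PATH_END = {"a": ["b"], "b": ["c"]}
PATH_MID = {"B": ["A", "C"]}
```

Both T1 and T2 get the root name 11011000, so they are reported as isomorphic. The two rootings of the three-vertex path get different names (111000 and 110100), matching the earlier remark that trees can be isomorphic as graphs but not as rooted trees.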
Understanding AHU algorithm (part 5)
We should discuss some important questions.
Invariant?
Is the canonical name of a root a tree isomorphism invariant?
Complete invariant?
Is the canonical name of a root a complete tree isomorphism invariant?

AHU-Tree-Isomorphism(T1, T2)
r1 ← root(T1)
r2 ← root(T2)
Assign-Canonical-Names(r1)
Assign-Canonical-Names(r2)
if name(r1) = name(r2) then
    return True
else
    return False
end if
AHU algorithm improvement
Observation
For a tree of n vertices that forms one long strand, computing the root name takes time proportional to 1 + 2 + ··· + n, which is Ω(n²).
Observation
For all levels i, the canonical name of level i is a tree isomorphism invariant.
Observation
Two trees T1 and T2 are isomorphic if and only if for all levels i the canonical level names of T1 and T2 are identical.
The idea 1
Assign canonical names level by level, sort by level, and check level by level that the canonical level names agree.
The idea 2
Assign canonical names for a level and, if the canonical level names agree, then replace the canonical names with integers.
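The two ideas can be combined into a level-by-level sketch in Python. It assumes the children-dictionary representation of a rooted tree, and the function names are illustrative. For brevity it uses Python's comparison sort on each level, so it illustrates the structure of the improved algorithm rather than achieving a strict O(|V|) bound.

```python
def levels(children, root):
    """Group vertices by depth: result[i] lists the vertices on level i."""
    result, frontier = [], [root]
    while frontier:
        result.append(frontier)
        frontier = [w for v in frontier for w in children.get(v, [])]
    return result

def ahu_isomorphic(children1, root1, children2, root2):
    """Level-by-level isomorphism test for rooted trees with integer labels."""
    levels1, levels2 = levels(children1, root1), levels(children2, root2)
    if len(levels1) != len(levels2):
        return False
    label1, label2 = {}, {}
    # Work upwards from the deepest level, so children are labelled before parents.
    for layer1, layer2 in zip(reversed(levels1), reversed(levels2)):
        keys1 = {v: tuple(sorted(label1[w] for w in children1.get(v, [])))
                 for v in layer1}
        keys2 = {v: tuple(sorted(label2[w] for w in children2.get(v, [])))
                 for v in layer2}
        if sorted(keys1.values()) != sorted(keys2.values()):
            return False                  # canonical level names disagree
        # Idea 2: replace each distinct tuple on this level by a small integer.
        rank = {key: i for i, key in enumerate(sorted(set(keys1.values())), start=1)}
        label1.update((v, rank[key]) for v, key in keys1.items())
        label2.update((v, rank[key]) for v, key in keys2.items())
    return True

T1 = {"A": ["B", "C"], "C": ["D", "E"]}
T2 = {"a": ["b", "c"], "b": ["d", "e"]}
T3 = {"r": ["x", "y"], "x": ["z"], "y": ["w"]}
```

For T1 and T2 the leaves get the label 1, the level-1 vertices get the labels 1 and 2, and both roots get the label 1, so the trees are reported as isomorphic. T3 has the same number of levels but a different bottom level, so it is rejected.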
AHU algorithm example

Example
(T1 has root A with children B and C, and C has children D and E. T2 has root a with children b and c, and b has children d and e.)

A 1                        a 1

B 1    C 2                 b 2    c 1

       D 1  E 1            d 1  e 1

Level 2: the leaves D, E, d, e get the label 1. Level names: 1,1 and 1,1.
Level 1: the leaves B and c get the label 1; C and b (children 1,1) get the label 2. Sorted level names: 1,2 and 1,2.
Level 0: A and a both have the children 1,2 and get the label 1. Level names: 1 and 1.
The level names agree on every level: OK, the trees are isomorphic.

Summary
∙ We have made three unsuccessful attempts to construct a complete tree isomorphism invariant.
∙ We discussed the O(|V|²) version of the AHU algorithm.
∙ We discussed ways of improving the AHU algorithm to make it work in O(|V|) time.

Thank you for your attention!
Any questions?