
# Union Find Algorithm Overview

Pi19404

February 17, 2013

Contents

- 0.1 Abstract
- 0.2 Equivalence Relation
  - 0.2.1 Formal Definition
  - 0.2.2 Equivalence Relation and Disjoint Subsets
  - 0.2.3 Equivalence Class
- 0.3 Union Find Algorithm
  - 0.3.1 Quick Find
  - 0.3.2 Quick Union
  - 0.3.3 Weighted Quick Unions
  - 0.3.4 Weighted Quick Unions with path compression
- 0.4 Code


0.1 Abstract

Consider a set of N objects, labelled from 0 to N-1. A pair of objects may be connected to each other. The input is thus a sequence of pairs of integers, where each integer represents an object of some type. We first need to define what it means for two integers p and q to be connected to each other.

0.2 Equivalence Relation

Equality is an equivalence relation, and an equivalence relation is a weaker form of equality. Consider objects that share some properties and differ in other respects. We may regard such objects as belonging to a class characterized by the shared properties. Often there is no need to distinguish between objects that behave in the same way with respect to certain properties; we only need to know which class an object belongs to. When objects are not absolutely identical but are equal in one particular respect, we use an equivalence relation to describe that relationship. An equivalence relation can be considered a formal tool for disregarding differences between objects and treating them as equals.


0.2.1 Formal Definition

In mathematics we often need to investigate relationships between objects. Suppose an element $a$ of a set $A$ is related to an element $b$ of a set $B$ in some way; the ordered pair $(a, b)$ is then distinguished by this relation. If $A$ and $B$ are two sets, a relation $R$ from $A$ into $B$ is a subset of the Cartesian product $A \times B$.

Let $A$ be a nonempty set. A relation $R$ on $A$ (a subset $R$ of $A \times A$) is called an equivalence relation on $A$ if the following hold:

1. Reflexive: $(a, a) \in R$ for all $a \in A$.
2. Symmetric: if $(a, b) \in R$, then $(b, a) \in R$.
3. Transitive: if $(a, b) \in R$ and $(b, c) \in R$, then $(a, c) \in R$.

This can also be expressed as follows, writing $a \sim b$ for $(a, b) \in R$. A relation $\sim$ on a nonempty set $A$ is an equivalence relation on $A$ if the following hold for all $a, b, c \in A$:

1. Reflexive: $a \sim a$.
2. Symmetric: if $a \sim b$, then $b \sim a$.
3. Transitive: if $a \sim b$ and $b \sim c$, then $a \sim c$.

An equivalence relation is one that satisfies the reflexive, symmetric and transitive properties.
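The three properties can be checked mechanically for a relation on a small finite set. The following is a minimal sketch in Python, assuming the relation is given as a set of ordered pairs; the function name `is_equivalence_relation` and the two example relations are illustrative choices, not part of the original text.

```python
from itertools import product


def is_equivalence_relation(elements, relation):
    """Return True if `relation` (a set of ordered pairs) is an
    equivalence relation on the finite set `elements`."""
    elements = set(elements)
    relation = set(relation)
    # Reflexive: (a, a) is in R for every a in A.
    reflexive = all((a, a) in relation for a in elements)
    # Symmetric: (a, b) in R implies (b, a) in R.
    symmetric = all((b, a) in relation for (a, b) in relation)
    # Transitive: (a, b) and (b, c) in R imply (a, c) in R.
    transitive = all(
        (a, d) in relation
        for (a, b), (c, d) in product(relation, repeat=2)
        if b == c
    )
    return reflexive and symmetric and transitive


# Illustrative relations on A = {0, 1, 2, 3}.
A = {0, 1, 2, 3}
same_parity = {(a, b) for a in A for b in A if a % 2 == b % 2}
less_or_equal = {(a, b) for a in A for b in A if a <= b}
```

Here `same_parity` satisfies all three properties, while `less_or_equal` is reflexive and transitive but not symmetric, so it is not an equivalence relation.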

**0.2.2 Equivalence Relation and Disjoint Subsets**

An equivalence relation on a set X gives rise to a partition of X into disjoint subsets: X is the union of certain subsets of X that are mutually disjoint. Conversely, a partition of a nonempty set X into pairwise disjoint subsets defines an equivalence relation on X.

A partition of X is a set P of nonempty subsets of X such that every element of X belongs to exactly one element of P. The elements of P are pairwise disjoint and their union is X.

0.2.3 Equivalence Class

Let $\sim$ be an equivalence relation on a nonempty set $X$ and let $a \in X$. The equivalence class of $a$ is defined to be the set of all elements of $X$ that are equivalent to $a$. It is denoted by $[a]$:

$$[a] := \{x \in X \mid x \sim a\}, \qquad [a] \subseteq X.$$

The set of all equivalence classes is denoted by $X/\sim$.

The equivalence classes form a partition of the set $X$: $X$ is the union of the equivalence classes, and distinct equivalence classes are disjoint. Conversely, if $X$ is a union of nonempty, mutually disjoint subsets, then there is an equivalence relation on $X$ such that each subset is an equivalence class.
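As a small illustration of how equivalence classes partition a set, the sketch below groups the elements of a finite collection using a predicate that is assumed to be an equivalence relation. The helper name `equivalence_classes` and the "same remainder modulo 3" example are hypothetical choices made for this illustration.

```python
def equivalence_classes(elements, equivalent):
    """Group `elements` into equivalence classes.

    `equivalent(x, y)` is assumed to be an equivalence relation;
    the result is not meaningful otherwise.
    """
    classes = []
    for x in elements:
        for cls in classes:
            # By transitivity it is enough to compare x with one
            # representative of each class.
            if equivalent(x, cls[0]):
                cls.append(x)
                break
        else:
            # x is not equivalent to any existing class: start a new one.
            classes.append([x])
    return classes


# Illustrative relation: x ~ y when x and y have the same remainder mod 3.
classes = equivalence_classes(range(7), lambda x, y: x % 3 == y % 3)
```

Every element of the input lands in exactly one class, so the classes are pairwise disjoint and their union is the whole set.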

**0.3 Union Find Algorithm**

Equivalence relations occur in many applications. For example, in an undirected graph, connectivity between vertices is an equivalence relation. Each set of mutually connected vertices satisfies the equivalence relation and forms a subset of the set of all vertices of the graph; the equivalence relation thus partitions the graph into connected components. The union find algorithm returns the subset, or connected component, that an element belongs to. This information can also be used to check whether two elements of the graph satisfy the equivalence relation, that is, whether they belong to the same equivalence class and are connected.

To construct a connected graph we need to join connected components. This corresponds to the union of two equivalence classes: connecting an element of one class to an element of the other combines the two disjoint classes into a single equivalence class. Thus a basic union find data structure must support the following operations:

- find(A): find the equivalence class that element A belongs to.
- union(A,B): merge the equivalence classes of elements A and B.

These basic operations can be used to build more complex operations.
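To make the two operations concrete before looking at efficient implementations, here is a deliberately naive sketch that stores each equivalence class explicitly as a Python set. The class name `NaiveDisjointSets` and the derived `connected` method are assumptions made for illustration; this is not one of the algorithms discussed in the following sections.

```python
class NaiveDisjointSets:
    """Stores equivalence classes explicitly as sets (illustration only)."""

    def __init__(self, n):
        # Initially every element 0..n-1 is in its own class.
        self.classes = [{i} for i in range(n)]

    def find(self, a):
        """Return the class (a set) that contains element a."""
        for cls in self.classes:
            if a in cls:
                return cls
        raise KeyError(a)

    def union(self, a, b):
        """Merge the classes containing a and b."""
        class_a, class_b = self.find(a), self.find(b)
        if class_a is not class_b:
            self.classes.remove(class_b)
            class_a |= class_b  # in-place union keeps class_a in the list

    def connected(self, a, b):
        """A more complex operation built from find."""
        return self.find(a) is self.find(b)
```

Both operations scan the list of classes here, which is why the array-based representations described next are preferred.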

0.3.1 Quick Find

Let us consider the quick find algorithm. The first task is to define the operations the algorithm must support, and then to define a data structure that supports them. Let N be the number of elements of the graph. Each component is indexed by an id that identifies the equivalence class an object belongs to. Thus for N elements we keep an index array of length N that holds the id of each element's equivalence class. There can be at most N and at least 1 distinct ids. The data structure used by quick find is therefore an integer array of size N. Initially every element belongs to its own equivalence class, identified by the integers 0 to N-1, so the constructor initializes the array with ids 0 to N-1.

The find operation compares the ids of two elements and returns true if they are equal and false otherwise. The union operation on elements a and b from disjoint equivalence classes changes the ids of all elements of the two classes to the same id. Union operations are required to construct the graph that defines the problem. We use the following convention: merge(a,b) assigns the id of b's equivalence class to a's class, so the id of every element in the equivalence class of a is changed to the id of b.

Consider the performance cost of this algorithm, measured in memory accesses. Initialization and union each cost on the order of N accesses, while find costs a constant number. Performing M union operations takes on the order of MN memory accesses, so for M = N the algorithm takes quadratic time. Quick find is therefore very expensive for a large number of unions.
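The description above can be sketched in Python as follows. This is an illustrative version written for this overview, not the code from the repository; the method names `find`, `connected` and `union` are assumptions, with `connected` playing the role of the boolean check described in the text.

```python
class QuickFind:
    def __init__(self, n):
        # id[i] is the id of the equivalence class of element i.
        # Initially each element is its own class: N array accesses.
        self.id = list(range(n))

    def find(self, p):
        """Return the class id of p: a single array access."""
        return self.id[p]

    def connected(self, p, q):
        """True if p and q have the same class id."""
        return self.id[p] == self.id[q]

    def union(self, p, q):
        """Relabel every element of p's class with q's class id."""
        pid, qid = self.id[p], self.id[q]
        if pid == qid:
            return
        # The whole array is scanned, so union costs on the order of N.
        for i in range(len(self.id)):
            if self.id[i] == pid:
                self.id[i] = qid
```

For example, after `union(0, 1)` and `union(1, 2)` on five elements, elements 0, 1 and 2 all carry the id 2, following the convention that the class of the first argument takes the id of the second.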

0.3.2 Quick Union

Consider the following alternative to the quick find algorithm. The data structure is again an integer array of size N, but now the array element id[i] contains the parent of element i. To determine whether two objects are in the same equivalence class, we follow the parents of each until we reach an object that points to itself. This element is called the root of the tree, and every element has a root associated with it. Two objects that belong to the same equivalence class lead to the same root element.

To merge two equivalence classes containing elements p and q, set the id of p's root to q's root. The root of the class q belonged to then becomes the root of every element of the class p belonged to. In this way two large sets can be merged by changing just one entry in the array. Initialization consists of setting up an array of N elements, each of which is its own root. We follow the convention that union(a,b) finds the root of a and the root of b and assigns the root of b as the parent of the root of a.

Again the performance measure is the number of memory accesses. Initialization takes N memory accesses. Finding a root takes up to N accesses in the worst case, so the find operation, which finds two roots, takes up to 2N in the worst case, and the union operation also takes up to 2N. As the trees get taller, the find operation gets more expensive. The advantage over quick find is in the union operation, which changes only one entry once the roots are known. However, this is still too slow an algorithm for large problems.
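A minimal Python sketch of quick union, under the same conventions as the text, might look like this. It is an illustration rather than the repository code, and the helper name `root` is an assumption.

```python
class QuickUnion:
    def __init__(self, n):
        # id[i] is the parent of i; a root is its own parent.
        self.id = list(range(n))

    def root(self, p):
        """Follow parent links until reaching an element that points to itself."""
        while self.id[p] != p:
            p = self.id[p]
        return p

    def connected(self, p, q):
        """True if p and q lead to the same root."""
        return self.root(p) == self.root(q)

    def union(self, p, q):
        """Make the root of q the parent of the root of p."""
        root_p, root_q = self.root(p), self.root(q)
        self.id[root_p] = root_q
```

Note that nothing here limits tree height: calling `union(0, 1)`, `union(1, 2)`, `union(2, 3)` and so on builds a single chain, which is the tall-tree worst case mentioned above.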

**0.3.3 Weighted Quick Unions**

One approach to improving the algorithm is to take steps to avoid tall trees. This can be done by keeping track of the size of each tree (equivalence class), that is, the number of objects it contains. When performing a merge, we make the root of the larger equivalence class the parent of the root of the smaller one. The weighted union guarantees that the trees are not too tall. This gives a smaller average distance from an element to its root than in the quick union algorithm, reducing the number of memory accesses. The data structure contains an extra size array, where size[i] is the number of objects in the tree rooted at i.

The find operation consists of finding the roots of elements, so its cost depends on the depth of the elements below the root. The depth of any node in the tree is at most $\log_2 N$.

The depth of an element x increases by 1 only when the tree containing it is merged into another tree. Consider a tree $T_1$ containing x and another tree $T_2$ with $|T_2| \ge |T_1|$. If $T_1$ is merged into $T_2$, the depth of x below the root increases by 1, and the size of the tree containing x increases by at least a factor of 2. In the worst case the size of the tree containing x grows as

$$2^0, 2^1, 2^2, 2^3, \ldots, 2^k.$$

Since a tree contains at most N elements, its size can double at most $k = \log_2 N$ times. Thus the depth of any node is at most $k = \log_2 N$.

Initialization takes time proportional to N. Union and find both take time proportional to $\log_2 N$. Thus the algorithm scales well as N increases.
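The weighting rule can be sketched in Python as below. This is an illustrative version, not the repository code; the `size` array follows the text, while the helper `depth` is a hypothetical addition used only to observe the logarithmic bound.

```python
class WeightedQuickUnion:
    def __init__(self, n):
        self.id = list(range(n))   # id[i] is the parent of i
        self.size = [1] * n        # size[i]: number of objects in tree rooted at i

    def root(self, p):
        while self.id[p] != p:
            p = self.id[p]
        return p

    def depth(self, p):
        """Number of links from p to its root (for inspection only)."""
        d = 0
        while self.id[p] != p:
            p = self.id[p]
            d += 1
        return d

    def connected(self, p, q):
        return self.root(p) == self.root(q)

    def union(self, p, q):
        root_p, root_q = self.root(p), self.root(q)
        if root_p == root_q:
            return
        # Make root_p refer to the larger (or equal) tree.
        if self.size[root_p] < self.size[root_q]:
            root_p, root_q = root_q, root_p
        # Link the smaller tree's root below the larger tree's root.
        self.id[root_q] = root_p
        self.size[root_p] += self.size[root_q]
```

Merging eight elements pairwise, then the pairs, then the two groups of four is the doubling worst case from the argument above; the deepest node then sits at depth 3, which equals $\log_2 8$.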

**0.3.4 Weighted Quick Unions with path compression**

The root-finding operation is performed during both the find and the union operations. Let p be an element of the tree. To find the root, we traverse the parents of p. Once the root is found, we assign it directly as the parent of p, so that any further operations on p have a reduced access cost. In the same way, every node traversed on the path can be given the root as its parent. This adds only a constant-factor extra cost: the tree is traversed once to find the root and once more to assign the root as the parent of each node on the path. Another way to reduce the cost, using a single pass, is to roughly halve the path length by making each node on the path point to its grandparent.

The running time of weighted quick union with path compression is near linear: any sequence of M union and find operations on N objects takes at most $c\,(N + M \log^* N)$ array accesses. The function $\log^* N$ is the iterated logarithm, the number of times the logarithm must be applied to N to reach a value of at most 1.
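Both compression variants, and the iterated logarithm, can be sketched in Python as follows. This is an illustrative version under the same conventions as the earlier sections; the names `root`, `root_halving` and `log_star` are assumptions, and base-2 logarithms are assumed for `log_star`.

```python
import math


class CompressedWeightedQuickUnion:
    def __init__(self, n):
        self.id = list(range(n))   # id[i] is the parent of i
        self.size = [1] * n        # size[i]: objects in tree rooted at i

    def root(self, p):
        """Two-pass compression: find the root, then point every
        node on the path directly at it."""
        r = p
        while self.id[r] != r:
            r = self.id[r]
        while p != r:
            parent = self.id[p]
            self.id[p] = r
            p = parent
        return r

    def root_halving(self, p):
        """One-pass variant: point each visited node at its grandparent."""
        while self.id[p] != p:
            self.id[p] = self.id[self.id[p]]
            p = self.id[p]
        return p

    def union(self, p, q):
        root_p, root_q = self.root(p), self.root(q)
        if root_p == root_q:
            return
        if self.size[root_p] < self.size[root_q]:
            root_p, root_q = root_q, root_p
        self.id[root_q] = root_p
        self.size[root_p] += self.size[root_q]


def log_star(n):
    """Iterated base-2 logarithm: how many times log2 must be
    applied to n before the value is at most 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```

The iterated logarithm grows extremely slowly: it is 4 for N = 65536, which is why the bound is described as near linear.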

0.4 Code

Code for quick find and quick union is available in the repository https://github.com/pi19404/m19404/tree/master/Algorithm/UnionFind/
