# An Introduction of Union-Find Disjoint Sets

## City Connections

Let us consider a scenario of building cities with roads. A map of city can be regarded as a bunch of junctions of roads, the question is how we can figure out if I can get from a junction to another junction in a efficient time. To solve this problem, we can adopt Union-find Disjoint Sets (or disjoint sets).

## Reading in Roads Info

Since we are curious about how the roads in the map are connected. Let we mock up some data, let say:

```
1 -- 3
1 -- 2
2 -- 4
3 -- 4
5 -- 6
6 -- 7
```

The notations `--`

indicate a interconnection of each pair of junctions. We can build a graph according to the specification, like this:

In this graph, if we want to check if node 4 is connected with node 7, we can use search algorithms such as Depth-first Search (DFS) or Breadth-first Search (BFS). If we want to make sure we cannot go to node 7 from node 4, we must traverse all around the left 1-2-4-3-1 graph. The time complexity of checking is O(N).

If we just want to know if they are interconnected, the more time-saving way is to build disjoint sets, then we can lower down the time complexity with the help of trees.

## How it works

To model this problem, we can regard each traffice junctions as a node, as shown in the figure. At first, we can initiate each node having their parent node as itself since disjoint sets work with tree data structures (tree have a root node). In the current scenario, every node is a root node and there are 7 trees in this figure, which can easily illustrated by the Python code:

```
parents = [0, 1, 2, 3, 4, 5, 6, 7] # index 0 omitted
```

Now we read in the data of how the junctions connected. Such as `1 -- 3`

, we need to change the parent of one node to another (whatever it is node 1 or node 3). The diagram illustrate the result of disjoint sets after alternation.

```
parents = [0, 3, 2, 3, 4, 5, 6, 7]
```

Okay, seems easy. Let us do the next link, `1 -- 2`

. Follow with the rules of tree, a node in a tree can only have one parent. If we change the parent of node 1 to node 2, node 3 could no longer be parent of node 1, which causes loss of information. The solution is pretty straight-forward, since each root node have parent node as themselves. Since each group of interconnected nodes can be identified by different root nodes, if node 2 want to join the group which having root node is node 3, then just simply change the parent of node 3 (group root node) to node 2.

Here are the rules:

- If we are going to link two nodes which are in different groups (single node is also considered as a group), link the root nodes of each groups together.
- If we are going to link two nodes which are in the same group (having the same group root node), do nothing.

The Python code for finding a group's root node is:

```
def parent_of(node):
if parents[node] is not node:
return parent_of(parents[node])
else:
return node
```

By applying the rule, we can finally get following trees after linked `6 -- 7`

:

In the final result, `6 -- 7`

, we have two groups, root node 4 and root node 7. Back to the question, to check if node 4 and node 7 are interconnected, we need to check if `parent_of(4)`

is equal to `parent_of(7)`

.

However, currently, the worse case of disjoint sets is that trees are builded as a linked list (degenerate into a chain), as the example given. The time complexity is still similar to that of utilizing search algorithms. To tackle the issue, we must update the current disjoint set into a 1-depth tree, as illustrated in the following figure.

The following updated `parent_of(node)`

function can implement this action

```
def parent_of(node):
if parents[node] is not node:
parents[node] = parent_of(parents[node]) # update parent of node to root
return parents[node]
else:
return node
```

When we get disjoint sets after compression of path, the time complexity of checking if two nodes are interconnected will be optimized.