1.. SPDX-License-Identifier: GPL-2.0 2 3==================== 4Union-Find in Linux 5==================== 6 7 8:Date: June 21, 2024 9:Author: Xavier <xavier_qy@163.com> 10 11What is union-find, and what is it used for? 12------------------------------------------------ 13 14Union-find is a data structure used to handle the merging and querying 15of disjoint sets. The primary operations supported by union-find are: 16 17 Initialization: Resetting each element as an individual set, with 18 each set's initial parent node pointing to itself. 19 20 Find: Determine which set a particular element belongs to, usually by 21 returning a “representative element” of that set. This operation 22 is used to check if two elements are in the same set. 23 24 Union: Merge two sets into one. 25 26As a data structure used to maintain sets (groups), union-find is commonly 27utilized to solve problems related to offline queries, dynamic connectivity, 28and graph theory. It is also a key component in Kruskal's algorithm for 29computing the minimum spanning tree, which is crucial in scenarios like 30network routing. Consequently, union-find is widely referenced. Additionally, 31union-find has applications in symbolic computation, register allocation, 32and more. 33 34Space Complexity: O(n), where n is the number of nodes. 35 36Time Complexity: Using path compression can reduce the time complexity of 37the find operation, and using union by rank can reduce the time complexity 38of the union operation. These optimizations reduce the average time 39complexity of each find and union operation to O(α(n)), where α(n) is the 40inverse Ackermann function. This can be roughly considered a constant time 41complexity for practical purposes. 42 43This document covers use of the Linux union-find implementation. For more 44information on the nature and implementation of union-find, see: 45 46 Wikipedia entry on union-find 47 https://en.wikipedia.org/wiki/Disjoint-set_data_structure 48 49Linux implementation of union-find 50----------------------------------- 51 52Linux's union-find implementation resides in the file "lib/union_find.c". 53To use it, "#include <linux/union_find.h>". 54 55The union-find data structure is defined as follows:: 56 57 struct uf_node { 58 struct uf_node *parent; 59 unsigned int rank; 60 }; 61 62In this structure, parent points to the parent node of the current node. 63The rank field represents the height of the current tree. During a union 64operation, the tree with the smaller rank is attached under the tree with the 65larger rank to maintain balance. 66 67Initializing union-find 68----------------------- 69 70You can complete the initialization using either static or initialization 71interface. Initialize the parent pointer to point to itself and set the rank 72to 0. 73Example:: 74 75 struct uf_node my_node = UF_INIT_NODE(my_node); 76 77or 78 79 uf_node_init(&my_node); 80 81Find the Root Node of union-find 82-------------------------------- 83 84This operation is mainly used to determine whether two nodes belong to the same 85set in the union-find. If they have the same root, they are in the same set. 86During the find operation, path compression is performed to improve the 87efficiency of subsequent find operations. 88Example:: 89 90 int connected; 91 struct uf_node *root1 = uf_find(&node_1); 92 struct uf_node *root2 = uf_find(&node_2); 93 if (root1 == root2) 94 connected = 1; 95 else 96 connected = 0; 97 98Union Two Sets in union-find 99---------------------------- 100 101To union two sets in the union-find, you first find their respective root nodes 102and then link the smaller node to the larger node based on the rank of the root 103nodes. 104Example:: 105 106 uf_union(&node_1, &node_2); 107