Skip to main content

Preview — Pro guide

You are seeing a portion of this guide. Sign in and upgrade to unlock the full article, quizzes, and interview answers.

DSA·Intermediate

Union-Find (DSU): Dynamic Connectivity in Near-Constant Time

Master Disjoint Set Union with path compression and union by rank for O(α(n)) per operation. Covers connected components, cycle detection, accounts merge, and when DSU outperforms BFS/DFS for graph problems.

30 min read 3 sections 1 interview questions
Union FindDisjoint Set UnionDSUPath CompressionUnion by RankConnected ComponentsCycle DetectionInverse AckermannDynamic ConnectivityKruskal MSTAccounts MergeGrid Connectivity

What Union-Find Solves — Dynamic Connectivity

Union-Find (Disjoint Set Union, DSU) answers one question efficiently: are nodes X and Y in the same connected component? It supports two operations:

  • union(x, y) — merge the components containing x and y
  • find(x) — return the root representative of x's component
  • connected(x, y) — returns find(x) == find(y)

The key word is dynamic: union-find answers connectivity queries as edges are added incrementally, without rebuilding the data structure. This is the use case where union-find beats BFS/DFS — when the graph is constructed edge-by-edge and you need to answer connectivity queries between edge additions.

Without optimizations (naive implementation where every node points to parent, no compression): find walks up the parent chain, O(n) in the degenerate case where the tree becomes a long chain. union uses a find, so also O(n).

With both optimizations (path compression + union by rank): O(α(n)) amortized per operation, where α is the inverse Ackermann function. For all practical n (including n = 10^80, the number of atoms in the observable universe), α(n) ≤ 4. Effectively O(1) amortized.

This near-constant performance across n up to 10^80 makes union-find one of the most practically efficient data structures in computer science. Kruskal's MST algorithm, network connectivity analysis, and real-time multiplayer game region detection all rely on it.

When to Use Union-Find — Decision Framework

01

Check if the graph is undirected and dynamic

Union-Find only models undirected connectivity. For directed graphs, use DFS with coloring. Dynamic means edges are added incrementally — if you have all edges upfront and need one-time connectivity, simple BFS/DFS is simpler.

02

Identify what gets 'unioned'

The 'nodes' don't have to be integers. For strings (emails, names), build an email_to_id mapping first. For grid cells (row, col), convert to linear index: r * cols + c. For variable names, assign monotonic integer IDs.

03

Initialize with the right size

If nodes are 1-indexed (LeetCode common), initialize UnionFind(n+1) to avoid index-0 confusion. If nodes are 0-indexed, UnionFind(n). Set num_components = n initially, decrement by 1 per successful union.

04

Always implement both optimizations

Path compression: one line in find() — if parent[x] != x: parent[x] = find(parent[x]). Union by rank: compare ranks, attach smaller under larger. Without both: O(log n) or worse instead of O(α(n)).

05

Use union's return value for cycle detection

union(u, v) returns False when u and v are already connected. This is cycle detection for free: the first edge where union returns False is the redundant edge. No separate cycle-detection code needed.

IMPORTANT

Premium content locked

This guide is premium content. Upgrade to Pro to unlock the full guide, quizzes, and interview Q&A.