Imagine you’re given a set of nodes and edges, and you want to determine if they form a valid tree. A valid tree in graph theory has two key properties.
- It is connected–every node can be reached from any other node.
- It has no cycles–there’s exactly one path between any two nodes.
In simple terms, a valid tree is like a family tree: everyone is connected and there are no loops.
This problem is common in interviews as it combines graph traversal, cycle detection, and connectivity checks.
DFS (Depth-First Search) and Union-Find are two common approaches for solving graph problems like determining connectivity and detecting cycles. Each method has its strengths and weaknesses, depending on the problem context.
Depth First Search
Advantages:
1. Simplicity:
• DFS is conceptually simple and easy to implement using recursion or a stack.
• Suitable for problems that require explicit exploration of graph paths (e.g., connected components).
2. Edge-Based Exploration:
• DFS is great for problems that involve exploring relationships between nodes, such as finding cycles or paths.
3. No Extra Data Structures:
• Uses only a visited array, making it straightforward and memory-efficient for sparse graphs.
Disadvantages:
1. Limited to Small Graphs:
• Recursive DFS can lead to stack overflow for large graphs due to its depth-first nature.
2. Inefficient for Dynamic Problems:
• DFS doesn’t handle dynamic updates (e.g., adding/removing edges) efficiently.
• Every update may require a fresh traversal of the graph.
3. Revisiting Nodes:
• For dense graphs, DFS may revisit many nodes and edges, leading to inefficiency.
public boolean validTree(int n, int[][] edges) {
if (edges.length != n - 1) return false; // Quick check for tree property
// Build adjacency list
List<List<Integer>> graph = new ArrayList<>();
for (int i = 0; i < n; i++) graph.add(new ArrayList<>());
for (int[] edge : edges) {
graph.get(edge[0]).add(edge[1]);
graph.get(edge[1]).add(edge[0]);
}
// DFS to check connectivity
boolean[] visited = new boolean[n];
dfs(0, -1, graph, visited);
// Ensure all nodes are visited
for (boolean v : visited) {
if (!v) return false;
}
return true;
}
private void dfs(int node, int parent, List<List<Integer>> graph, boolean[] visited) {
visited[node] = true;
for (int neighbor : graph.get(node)) {
if (!visited[neighbor]) {
dfs(neighbor, node, graph, visited);
} else if (neighbor != parent) { // Cycle detection
throw new IllegalStateException("Cycle detected");
}
}
}
Line-By-Line Explanation
Step1: Check if the Number of Edges is n – 1
if (edges.length != n - 1) return false; // Quick check for tree property
- A valid tree with n nodes must have exactly n – 1 edges. If this condition isn’t met, the graph cannot be a tree.
Step 2: Build an Adjacency List
List<List<Integer>> graph = new ArrayList<>();
for (int i = 0; i < n; i++) graph.add(new ArrayList<>());
for (int[] edge : edges) {
graph.get(edge[0]).add(edge[1]);
graph.get(edge[1]).add(edge[0]);
}
• Adjacency lists are efficient for traversal and are often preferred over adjacency matrices for sparse graphs (graphs with fewer edges).
Step 3: Perform DFS to Check Connectivity
boolean[] visited = new boolean[n];
dfs(0, -1, graph, visited);
• Initializes a visited array to track whether each node has been visited.
• Starts a Depth-First Search (DFS) traversal from node 0.
Step 4: Ensure All Nodes Are Visited
for (boolean v : visited) {
if (!v) return false;
}
• After DFS completes, checks if all nodes have been visited.
• If any node is unvisited, the graph is not connected, and thus not a tree.
Step 5: Mark nodes as visited (dfs)
visited[node] = true;
- Marks the current node as visited to avoid revisiting it.
Step 6: Iterate Through Neighbors
for (int neighbor : graph.get(node)) {
• Loops through all neighbors of the current node using the adjacency list.
Step 7: Recursively Visit Unvisited Neighbors
if (!visited[neighbor]) {
dfs(neighbor, node, graph, visited);
}
• If a neighbor hasn’t been visited, recursively calls DFS on it.
Step 8. Check for Cycles
else if (neighbor != parent) { // Cycle detection
throw new IllegalStateException("Cycle detected");
}
• Detects cycles, which disqualify the graph from being a tree.
Union Find
Advantages:
1. Efficient Cycle Detection:
• With path compression and union by rank, Union-Find can detect cycles in O(\alpha(V)), where \alpha is the inverse Ackermann function.
2. Dynamic Edge Addition:
• Handles edge additions efficiently, making it ideal for dynamic connectivity problems (e.g., adding roads or connections).
3. Compact Representation:
• Uses arrays to represent sets, leading to efficient operations with minimal overhead.
4. Global Connectivity Check:
• Quickly determines if two nodes are in the same connected component without needing a full traversal.
Disadvantages:
1. Initialization Overhead:
• Requires explicit initialization of parent and rank arrays, which adds complexity.
2. Limited Use Cases:
• Primarily suited for disjoint set-related problems like cycle detection and connectivity.
• Not ideal for problems requiring exploration of paths or edge weights.
3. Lacks Explicit Paths:
• Union-Find doesn’t explicitly store paths between nodes, making it unsuitable for pathfinding.
public class GraphValidTree {
public boolean validTree(int n, int[][] edges) {
if (edges.length != n - 1) return false; // A tree must have n-1 edges
UnionFind uf = new UnionFind(n);
for (int[] edge : edges) {
if (!uf.union(edge[0], edge[1])) return false; // Cycle detected
}
return uf.getCount() == 1; // Check if all nodes are in the same component
}
}
class UnionFind {
private int[] parent;
private int[] rank;
private int count;
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
count = size;
for (int i = 0; i < size; i++) {
parent[i] = i;
}
}
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
public boolean union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX == rootY) return false; // Cycle detected
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
count--;
return true;
}
public int getCount() {
return count;
}
}
Line-By-Line Explanation
Step 1: Check Edge Count
if (edges.length != n - 1) return false; // A tree must have n-1 edges
• A valid tree with n nodes must have exactly n-1 edges. If this condition is not met, it immediately returns false.
Step 3: Initialize Union-Find
UnionFind uf = new UnionFind(n);
• Creates a Union-Find data structure for n nodes. This data structure will manage connected components.
Step 4: Process Each Edge
for (int[] edge : edges) {
if (!uf.union(edge[0], edge[1])) return false; // Cycle detected
}
• Iterates through all edges and performs a union operation on the nodes of each edge.
• If union returns false, it means a cycle was detected, and the function immediately returns false.
Step 5: Check Connected Componenets
return uf.getCount() == 1; // Check if all nodes are in the same component
• Calls getCount() to ensure there is only one connected component, meaning all nodes are part of the same tree.
Step 6: UnionFind Class & Constructor
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
count = size;
for (int i = 0; i < size; i++) {
parent[i] = i;
}
}
• Initializes the parent array (each node is its own parent initially), the rank array (used for union by rank), and the count of components.
Step 7: find Operation
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
• Recursively finds the root of the set containing x.
• Performs path compression, which flattens the structure for future queries.
Step 8: Union Operation
public boolean union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX == rootY) return false; // Cycle detected
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
count--;
return true;
}
• Finds the roots of x and y. If the roots are the same, a cycle is detected, and it returns false.
• Otherwise, it merges the smaller tree into the larger tree (union by rank) and decreases the component count.
Step 9: GetCount Operation
public int getCount() {
return count;
}
• Returns the number of connected components.
Why This Code Works
DFS and Union-Find are powerful tools for graph problems, but they serve different purposes. DFS excels in path exploration and explicit traversal, while Union-Find is ideal for efficient cycle detection and connectivity checks, especially in dynamic or large graphs. This code leverages Union-Find with path compression and union by rank, ensuring optimal performance by keeping operations nearly constant time. By merging connected components and efficiently managing disjoint sets, this solution provides a clean, scalable approach to validate trees. It highlights how algorithmic patterns can be tailored to specific problem constraints for maximum efficiency.