The LeetCode problem Find Duplicate Number requires identifying a duplicate in an array of n + 1 integers, where each integer is in the range 1 to n. Only one duplicate number is guaranteed, and the problem must be solved without modifying the array, in constant space, and with a runtime better than O(n^2).
This problem has multiple solutions, but we aim to highlight the most efficient one using Floyd’s Cycle Detection Algorithm (also known as the Tortoise and Hare method). This clever algorithm leverages properties of numbers and indices to detect cycles in linked-list-like data structures.
Brute Force Approach
public class FindDuplicate {
public int findDuplicate(int[] nums) {
for (int i = 0; i < nums.length; i++) {
for (int j = i + 1; j < nums.length; j++) {
if (nums[i] == nums[j]) {
return nums[i];
}
}
}
return -1; // Shouldn't reach here
}
}
• The outer loop iterates through every element.
• The inner loop checks if the current element matches any subsequent element.
• Returns the duplicate once found.
Optimal Solution: Floyd’s Cycle Detection
The numbers in the array represent “pointers” to indices. Using this property, the problem is equivalent to detecting a cycle in a linked list. The duplicate number is the “start” of the cycle.
1. Use two pointers (Tortoise and Hare) to traverse the array:
• Tortoise moves one step at a time.
• Hare moves two steps at a time.
2. Detect the cycle:
• If the Tortoise and Hare meet, a cycle exists.
3. Find the duplicate:
• Reset one pointer to the start of the array and move both pointers one step at a time. The point where they meet is the duplicate.
public class FindDuplicate {
public int findDuplicate(int[] nums) {
// Step 1: Detect the cycle using Floyd's Tortoise and Hare
int tortoise = nums[0];
int hare = nums[0];
do {
tortoise = nums[tortoise];
hare = nums[nums[hare]];
} while (tortoise != hare);
// Step 2: Find the entry point of the cycle (duplicate number)
tortoise = nums[0];
while (tortoise != hare) {
tortoise = nums[tortoise];
hare = nums[hare];
}
return hare;
}
}
Line-By-Line Explanation
Step 1: Detect the Cycle
int tortoise = nums[0];
int hare = nums[0];
- Initializes two pointers, tortoise and hare, both starting at the first index of the array.
- These pointers are used to detect a cycle in the array, treating indices and values as a graph-like structure.
Step 2: Move Fast and Slow Pointers
do {
tortoise = nums[tortoise]; // Tortoise moves one step
hare = nums[nums[hare]]; // Hare moves two steps
} while (tortoise != hare);
• Moves the tortoise pointer one step at a time: nums[tortoise].
• Moves the hare pointer two steps at a time: nums[nums[hare]].
• Continues looping until the two pointers meet, which indicates a cycle.
Step 3: Find the Cycle Entry Point (Duplicate Number)
tortoise = nums[0];
- After detecting a cycle, the meeting point of tortoise and hare is not necessarily the start of the cycle. This step prepares to find the exact starting point of the cycle, which corresponds to the duplicate number.
Step 4: Move Both Pointers
while (tortoise != hare) {
tortoise = nums[tortoise];
hare = nums[hare];
}
- Moves both tortoise and hare one step at a time.
- Continues until the two pointers meet, which is guaranteed to happen at the duplicate number.
- The first meeting of the two pointers (tortoise and hare) after resetting identifies the start of the cycle, which is the duplicate number in this problem.
Step 5: Return
return hare;
- Returns the value at the meeting point, which is the duplicate number.
Why This Code Works
This algorithm works because it treats the array as a graph where each index points to another index based on the value at that index, forming a directed graph. Floyd’s Cycle Detection Algorithm efficiently detects the cycle (caused by the duplicate) by using fast and slow pointers, leveraging the mathematical properties of cycle traversal.
This algorithm is the best for this problem because:
1. Time Complexity: It runs in O(n), making it extremely efficient for large arrays.
2. Space Complexity: It uses O(1) space, as no additional data structures are required.
3. Applicability: The problem guarantees a cycle due to the duplicate, aligning perfectly with Floyd’s cycle detection.
By combining fast and slow pointers to detect the cycle and then using two pointers to find the entry point, this algorithm ensures both clarity and optimal performance, following LeetCode best practices.