Imagine you are continuously receiving numbers in a stream, and you want to keep track of the Kth largest element at any given time. This means if you are given a number k, you need to find the k-th largest element among all the numbers you have received so far.
This problem is commonly encountered in scenarios like monitoring live data (e.g., stock prices or server logs), where you need to keep track of the largest values in a dynamic stream of data.
Brute Force Approach
import java.util.ArrayList;
import java.util.Collections;
class KthLargest {
private int k;
private ArrayList<Integer> stream;
public KthLargest(int k, int[] nums) {
this.k = k;
this.stream = new ArrayList<>();
for (int num : nums) {
stream.add(num);
}
}
public int add(int val) {
stream.add(val); // Add the new value to the stream
Collections.sort(stream, Collections.reverseOrder()); // Sort in descending order
return stream.get(k - 1); // Return the kth largest element
}
}
- The constructor initializes the object with k and adds all numbers in the given array to a list.
- Each time a new value is added:
- The value is appended to the list.
- The list is sorted in descending order.
- The k-th largest element is retrieved by accessing the k – 1 index.
Optimal Approach
The goal is simple: continuously find the k-th largest element in a stream of numbers. As new numbers arrive, the k-th largest element should update dynamically without needing to reprocess the entire dataset.
Think of scenarios like:
• Tracking the k-th most popular item in a live e-commerce dashboard.
• Monitoring the k-th highest stock price in real-time.
• Keeping tabs on server response times in a live analytics tool.
The key challenge? We need to handle a potentially infinite stream of numbers efficiently. Sorting the entire dataset every time a new number arrives isn’t feasible—it’s too slow and memory-intensive.
1. Min-Heap:
• Keeps the smallest element at the root.
• Efficiently maintains the k-th largest element with constant-time retrieval (O(1)) and logarithmic-time updates (O(log k)).
2. Greedy Approach:
• Only the top k largest elements are kept in the heap, ensuring the algorithm processes values dynamically with minimal storage.
class KthLargest {
private PriorityQueue<Integer> minHeap;
private int k;
public KthLargest(int k, int[] nums) {
this.k = k;
this.minHeap = new PriorityQueue<>(); // Min-heap to store the top k largest elements
for (int num : nums) {
add(num); // Add elements using the same logic as the add function
}
}
public int add(int val) {
if (minHeap.size() < k) {
minHeap.offer(val); // Add directly if heap has fewer than k elements
} else if (val > minHeap.peek()) {
minHeap.poll(); // Remove the smallest element
minHeap.offer(val); // Add the new value
}
return minHeap.peek(); // Return the kth largest element (root of the heap)
}
}
Step-By-Step Explanation
- Initialize State
private PriorityQueue<Integer> minHeap;
private int k;
- minHeap: A min-heap (priority queue) to store the largest k elements from the data stream.
- k: The k-th largest element to track.
2. Constructor
public KthLargest(int k, int[] nums) {
this.k = k;
this.minHeap = new PriorityQueue<>();
for (int num : nums) {
add(num); // Reuse the add function to populate the heap
}
}
• Initializes k with the required size.
• Creates a new min-heap.
• Iterates over the initial array nums, adding each element to the heap using the add method.
Step 3: add
public int add(int val) {
if (minHeap.size() < k) {
minHeap.offer(val); // Add directly if heap has fewer than k elements
} else if (val > minHeap.peek()) {
minHeap.poll(); // Remove the smallest element
minHeap.offer(val); // Add the new value
}
return minHeap.peek(); // Return the kth largest element (root of the heap)
}
- If the head contains fewer than k elements, the new value is added directly.
- If heap already contains k elements, compare the new value to the smallest.
- If val is larger, the smallest element is removed, and the new value is added.
- Return minHeap.peek().
Conclusion
This min-heap-based solution is the gold standard for solving the k-th largest element problem in a stream. It combines simplicity, efficiency, and scalability, making it perfect for real-world scenarios. The beauty of this approach lies in its ability to focus only on the most relevant data (the top k elements) while discarding the rest. By understanding the intuition behind this algorithm, you can appreciate its elegance and power in handling complex streaming problems effortlessly.