Median Two Sorted Arrays – 4. LeetCode

Imagine you have two lists of numbers that are already sorted in order. You want to combine them and find the middle number (or the average of the two middle numbers if the total count is even). This middle value is called the “median”. The challenge is to do this efficiently without fully merging the two lists.

Brute Force Approach

import java.util.ArrayList;
import java.util.Collections;

public class MedianOfTwoSortedArrays {
    public static double findMedianSortedArrays(int[] nums1, int[] nums2) {
        ArrayList<Integer> merged = new ArrayList<>();
        for (int num : nums1) merged.add(num);
        for (int num : nums2) merged.add(num);
        Collections.sort(merged); // Sort the combined array
        
        int n = merged.size();
        if (n % 2 == 1) {
            return merged.get(n / 2); // Odd: Return the middle element
        } else {
            return (merged.get(n / 2 - 1) + merged.get(n / 2)) / 2.0; // Even: Average two middle elements
        }
    }
}

1. Combine both arrays into a single list.

2. Sort the combined list to maintain order.

3. If the size of the list is odd, return the middle element. If even, return the average of the two middle elements.

This approach works but is inefficient because sorting the merged array takes O((m+n)\log(m+n)), where m and n are the sizes of the two input arrays.

Optimal Approach

Algorithm & Data Structure

Algorithm: Binary Search on partitions

Data Structure: Minimal use of extra space (arrays)

Intuition:

Instead of merging arrays, focus on finding the partition point in both arrays where the elements to the left are smaller than those on the right. Use binary search to efficiently locate this partition. The sorted nature of the arrays makes this possible.

Key characteristics of the problem:

• Both arrays are sorted.

• The median can be determined by splitting the arrays into two parts: left (smaller half) and right (larger half).

• Binary search is well-suited for problems involving sorted arrays and partitioning.

What is a partition?

A partition in this algorithm refers to the act of dividing one array (e.g., nums1) into two segments:

  • Left partition: Contains elements that are considered smaller or equal to the median.
  • Right partition: Contains elements that are considered larger or equal to the median.

Similarly, the other array (nums2) is implicitly divided into two segments based on the partition in the first array. This ensures that the combined dataset (from both arrays) is also split into two halves:

  • The left half contains the smaller elements from both arrays.
  • The right half contains the larger elements from both arrays.

Why Do We Partition?

  1. To Locate the Median Efficiently
    • The median divides the combined dataset into two halves of equal size (or nearly equal size if the total number of elements is odd).
    • By finding the correct partition, we can compute the median directly from the boundary values without merging the arrays.
  2. To Leverage Sorted Order:
    • The arrays are already sorted, so the partition ensures that elements in the left partition are smaller than those in the right partition. This eliminates the need to compare all elements.
  3. To Avoid Full Merging:
    • Merging two sorted arrays into a single sorted array would require O(m + n) time and space. Partitioning avoids this by only considering boundary elements, drastically reducing the computation effort.

Steps:

1. Ensure the first array is smaller than the second.

2. Perform binary search on the smaller array to find the partition point.

3. Check if the partition divides the combined data correctly:

• The largest element on the left side is smaller than the smallest element on the right.

4. Calculate the median:

• If total size is odd, it’s the maximum of the left partition.

• If even, it’s the average of the maximum of the left and minimum of the right.

public class MedianOfTwoSortedArrays {
    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
        // Ensure nums1 is the smaller array
        if (nums1.length > nums2.length) {
            return findMedianSortedArrays(nums2, nums1);
        }

        int x = nums1.length;
        int y = nums2.length;
        int low = 0, high = x;

        while (low <= high) {
            int partitionX = (low + high) / 2;
            int partitionY = (x + y + 1) / 2 - partitionX;

            // Handle edge cases for partition
            int maxX = (partitionX == 0) ? Integer.MIN_VALUE : nums1[partitionX - 1];
            int minX = (partitionX == x) ? Integer.MAX_VALUE : nums1[partitionX];

            int maxY = (partitionY == 0) ? Integer.MIN_VALUE : nums2[partitionY - 1];
            int minY = (partitionY == y) ? Integer.MAX_VALUE : nums2[partitionY];

            if (maxX <= minY && maxY <= minX) {
                // Found the correct partition
                if ((x + y) % 2 == 0) {
                    return (Math.max(maxX, maxY) + Math.min(minX, minY)) / 2.0;
                } else {
                    return Math.max(maxX, maxY);
                }
            } else if (maxX > minY) {
                high = partitionX - 1; // Move left
            } else {
                low = partitionX + 1; // Move right
            }
        }

        throw new IllegalArgumentException("Input arrays are not sorted.");
    }
}

Line By Line Explanation

Step 1: Ensure nums1 is smaller array

public class MedianOfTwoSortedArrays {
    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
        // Step 1: Ensure nums1 is the smaller array
        if (nums1.length > nums2.length) {
            return findMedianSortedArrays(nums2, nums1);
        }
  • Check nums1 is longer than nums2. If so, it swaps the arrays by calling the function again with the arrays switched.
  • The binary search will be performed on nums1. Ensuring it’s the smaller array minimizes the number of steps needed in the binary search, making it more efficient.

Step 2: Initialize Variables

        int x = nums1.length;
        int y = nums2.length;
        int low = 0, high = x;
  • x any y store the lengths of nums1 and nums2.
  • low and high are the search bounds for the binary search on nums1.
  • The variables help define the range for binary search (low to high) and partition calculations.

Step 3: Binary Search Loop

        while (low <= high) {
            int partitionX = (low + high) / 2;
            int partitionY = (x + y + 1) / 2 - partitionX;
  • Calculates partitionX, which divides nums1 into left and right parts.
  • Calculates partitionY, which divides nums2 into left and right parts based on partitionX.
  • The goal is to divide both arrays into two halves such that all elements on the left are smaller than or equal to all elements on the right.

Step 4: Handle Edges Cases For Partitions

            int maxX = (partitionX == 0) ? Integer.MIN_VALUE : nums1[partitionX - 1]; //Check if the partition in nums1 is at the very beginning
            int minX = (partitionX == x) ? Integer.MAX_VALUE : nums1[partitionX]; //Check if the partition in nums1 is at the very end.

            int maxY = (partitionY == 0) ? Integer.MIN_VALUE : nums2[partitionY - 1]; //Check if the partition in nums2 is at the very beginning.
            int minY = (partitionY == y) ? Integer.MAX_VALUE : nums2[partitionY]; //Check if the partition in nums2 is a the very beginning.
  • Assigns maxX and minX to represent the largest and smallest values around the partition in nums1.
  • Similarly, assigns maxY and minY for nums2.
  • Uses Integer.MIN_VALUE and Integer.MAX_VALUE for cases where the partition is at the boundary of the array (no elements on one side).

Step 5: Check if the partition is valid

            if (maxX <= minY && maxY <= minX) {
  • Confirms whether the largest element on the left side of the partition (maxX or maxY) is less than or equal to the smallest element on the right side (minX or minY).
  • This condition ensures the arrays are correctly divided into two halves, making it possible to calculate the median.

Step 6: Calculate the median

                if ((x + y) % 2 == 0) {
                    return (Math.max(maxX, maxY) + Math.min(minX, minY)) / 2.0;
                } else {
                    return Math.max(maxX, maxY);
                }
  • If the total number of elements is even, the median is the average of the largest element on the left and the smallest element on the right.
  • If the total number of elements is odd, the median is the largest element on the left.
  • This calculates the median directly from the partitioned halves.

Step 7: Adjust binary search bounds

            } else if (maxX > minY) {
                high = partitionX - 1; // Move left
            } else {
                low = partitionX + 1; // Move right
            }
  • If maxX > minY, the partition is too far to the right, so move the high pointer to the left.
  • Otherwise, move the low pointer to the right.
  • Adjusts the binary search range to find the correct partition.

Why This Code Words


his code efficiently finds the median of two sorted arrays by using binary search on one of the arrays to locate a partition that divides both arrays into two halves. It ensures that all elements in the left half are smaller than or equal to all elements in the right half by comparing boundary values at the partition points. By leveraging the sorted nature of the arrays and avoiding full merging, the algorithm achieves an optimal time complexity of O(\log(\min(m, n))) and minimal space usage, making it both elegant and efficient.

Leave a Reply

Your email address will not be published. Required fields are marked *