56. Merge Intervals

Medium 14,934 549 Accepted: 1,543,066 Submitted: 3,400,106 Acceptance Rate: 45.38%

Given an array of intervals where intervals[i] = [starti, endi], merge all overlapping intervals, and return an array of the non-overlapping intervals that cover all the intervals in the input.

Analysis

1. What is the min and max amount of intervals in the list?
2. What is the range for start and end values?
3. What should happen when there are no overlapping intervals?
4. Are repeated intervals possible and how should we handle that?
5. Are the intervals sorted in any manner?

Design

1. The brute force solution would be to iterate through each interval and then look at every other non-merged intervals and merge if overlapping. There are more details to account for with this solution if we go this route but it feels like we can easily do better.
2. In the brute force solution, we have to loop through all the intervals to determine which ones to merge because any of them could be a viable candidate to merge. One way to cut down the work is if we sort the intervals. This way, the potential merging intervals should all be nearby or after the current interval.

How should we sort the interval? We have two options, start or the end. Both could potentially work for this but if we sort by end and multiple intervals have the same end, we will need to use start as the tiebreaker to make sure we get the intervals in order. In that case, why not just go with start? Do we need a tiebreaker with start? No, because the two intervals with the same start will end up merging anyway.

How should the merging be done? We can have an output list and append intervals to it once we know an interval can't be merged anymore. We know an interval can't be merged anymore if the next interval's start time is greater than the current one's end time. We can keep track of the current earliest start and latest end in variables.

3. Can we do better than a sorted solution? It doesn't seem possible since we can't figure out which intervals are viable linearly without sorting.

Code

sorted_intervals = sorted(intervals, key=lambda x: x[0], reverse=False)


curr_earliest_start, curr_latest_end = sorted_intervals[0][0], sorted_intervals[0][1]
idx = 1
results = []
while idx < len(sorted_intervals):
if sorted_intervals[idx][0] <= curr_latest_end:
#overlaps
curr_latest_end = max(curr_latest_end, sorted_intervals[idx][1])
else:
results.append([curr_earliest_start, curr_latest_end])
curr_earliest_start, curr_latest_end = sorted_intervals[idx][0], sorted_intervals[idx][1]

idx += 1

# there should one leftover that we need to flush
results.append([curr_earliest_start, curr_latest_end])
return results

Test

1. All intervals merged
2. No overlapping intervals
3. Some overlapping and some not
4. Intervals with same times
5. Intervals with same start/end times and variations of the other in different orders

Complexity Analysis

n = number of intervals
The sort is O(n log n). The linear loop is O(n).
O(n log n)

Post-Solving Commentary

1. The LeetCode solution mentioned treating this as a graph connected component problem. This is similar to the design I mentioned but I didn't consider it as a graph.This shows a different way of thinking when it comes to interval problems.
2. An interesting option sorting the starts and ends separately.
3. Another option using a binary search tree.