Finding (number of) overlaps in a list of time ranges_问答_开发者

Given a list of time ranges, I need to find the maximum number of overlaps.

Following is a dataset showing a 10 minute interval of calls, from which I am trying to find the maximum number of active lines in that interval. ie. from the example below, what is the maximum number of calls that were active at the same time:

CallStart   CallEnd
2:22:22 PM  2:22:33 PM
2:22:35 PM  2:22:42 PM
2:22:36 PM  2:22:43 PM
2:22:46 PM  2:22:54 PM
2:22:49 PM  2:27:21 PM
2:22:57 PM  2:23:03 PM
2:23:29 PM  2:23:40 PM
2:24:08 PM  2:24:14 PM
2:27:37 PM  2:39:14 PM
2:27:47 PM  2:27:55 PM
2:29:04 PM  2:29:26 PM
2:29:31 PM  2:29:43 PM
2:29:45 PM  2:30:10 PM

If anyone knows an alogrithm or can point me in the right direction, I would be grateful.

TIA,

Steve F

Following must work:

Sort all your time values and save Start or End state for each time value.
Set numberOfCalls to 0 (count variable)
Run through your time values and:
- increment numberOfCalls if time value marked as Start
- decrement numberOfCalls if time value marked as End
- keep track of maximum value of numberOfCalls during the process (and time values when it occurs)

Complexity: O(n log(n)) for sorting, O(n) to run through all records

Here is a working algorithm in Python

def maximumOverlap(calls):
    times = []
    for call in calls:
        startTime, endTime = call
        times.append((startTime, 'start'))
        times.append((endTime, 'end'))
    times = sorted(times)

    count = 0
    maxCount = 0
    for time in times:
        if time[1] == 'start':
            count += 1    # increment on arrival/start
        else:
            count -= 1    # decrement on departure/end
        maxCount = max(count, maxCount)  # maintain maximum
    return maxCount

calls = [
('2:22:22 PM', '2:22:33 PM'),
('2:22:35 PM', '2:22:42 PM'),
('2:22:36 PM', '2:22:43 PM'),
('2:22:46 PM', '2:22:54 PM'),
('2:22:49 PM', '2:27:21 PM'),
('2:22:57 PM', '2:23:03 PM'),
('2:23:29 PM', '2:23:40 PM'),
('2:24:08 PM', '2:24:14 PM'),
('2:27:37 PM', '2:39:14 PM'),
('2:27:47 PM', '2:27:55 PM'),
('2:29:04 PM', '2:29:26 PM'),
('2:29:31 PM', '2:29:43 PM'),
('2:29:45 PM', '2:30:10 PM'),
]
print(maximumOverlap(calls))

How about a naive approach:

Take the least of the start times and the greatest of the end times (this is your range R)
Take the shortest call duration -- d (sorting, O(nlog n))
Create an array C, of ceil(R/d) integers, zero initialize
Now, for each call, add 1 to the cells that define the call's duration O(n * ceil(R/d))
Loop over the array C and save the max (O(n))

I guess you could model this as a graph too and fiddle around, but beats me at the moment.

In my opinion greedy algorithm will do the needful. The problem is similar to find out the number of platforms required for given trains timetable. So the number of overlaps will be the number of platforms required.
callStart times are sorted. Start putting each call in an array(a platform). So for call i and (i + 1), if callEnd[i] > callStart[i+1] then they can not go in the same array (or platform) put as many calls in the first array as possible. Then repeat the process with rest ones till all calls are exhausted. In the end, number of arrays are maximum number of overlaps. And the complexity will be O(n).

The following page has examples of solving this problem in many languages: http://rosettacode.org/wiki/Max_Licenses_In_Use

You short the list on CallStart. Then for each element (i) you see for all j < i if

CallEnd[j] > CallStart[i] // put it in a map with CallStart[i]  as the key and some count

Rest should be easy enough.

It's amazing how for some problems solutions sometimes just pop out of one mind... and I think I probably the simplest solution ;)

You can represent the times in seconds, from the beginning of your range (0) to its end (600). A call is a pair of times.

Python algorithm:

def maxSimultaneousCalls(calls):
  """Returns the maximum number of simultaneous calls
  calls   : list of calls
    (represented as pairs [begin,end] with begin and end in seconds)
  """
  # Shift the calls so that 0 correspond to the beginning of the first call
  min = min([call[0] for call in calls])

  tmpCalls = [(call[0] - min, call[1] - min) for call in calls]
  max = max([call[1] for call in tmpCalls])

  # Find how many calls were active at each second during the interval [0,max]
  seconds = [0 for i in range(0,max+1)]
  for call in tmpCalls:
    for i in range(call[0],call[1]):
      seconds[i] += 1

  return max(seconds)

Note that I don't know which calls were active at this time ;)

But in term of complexity it's extremely trivial to evaluate: it's linear in term of the total duration of the calls.

I think an important element of good solution for this problem is to recognize that each end time is >= the call's start time and that the start times are ordered. So rather than thinking in terms of reading the whole list and sorting we only need to read in order of start time and merge from a min-heap of the end times. This also addresses the comment Sanjeev made about how ends should be processed before starts when they have the exact same time value by polling from the end time min-heap and choosing it when it's value is <= the next start time.

max_calls = 0
// A min-heap will typically do the least amount of sorting needed here.
// It's size tells us the # of currently active calls.
// Note that multiple keys with the same value must be supported.
end_times = new MinHeap()
for call in calls:
  end_times.add(call.end)
  while (end_times.min_key() <= call.start) {
    end_times.remove_min()
  }
  // Check size after calls have ended so that a start at the same time
  // doesn't count as an additional call.  
  // Also handles zero duration calls as not counting.
  if (end_times.size() > max_calls) max_calls = end_times.size()
}

This seems like a reduce operation. The analogy is that each time a call is started, the current number of active calls is increased by 1. Each time a call is ended, the current number of calls drops to zero.

Once you have that stream of active calls all you need is to apply a max operation to them. Here is a working python2 example:

from itertools import chain
inp = ((123, 125),
       (123, 130),
       (123, 134),
       (130, 131),
       (130, 131),
       (130, 132),)

# technical: tag each point as start or end of a call
data = chain(*(((a, 'start'), (b, 'end')) for a, b in inp))

def r(state, d):
    last = state[-1]
    # if a call is started we add one to the number of calls,
    # if it ends we reduce one
    current = (1 if d[1] == 'start' else -1)
    state.append(last + current)
    return state

max_intersect = max(reduce(r, sorted(data), [0]))

print max_intersect