开发者

How to combine overlapping time ranges (time ranges union)

开发者 https://www.devze.com 2023-03-06 00:41 出处:网络
I have an array with several time ranges inside: [Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,

I have an array with several time ranges inside:

[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
 Tue, 24 May 2011 16:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00,
 Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 09:00:00 CEST +02:00,
 Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]

I want to get the same array with the overlapping time ranges combined, so the output for this case will be:

[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
 Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]

So it creates a new time range when to time ranges overlap, and so on. If they don´t overlap the will be keep separated. Another example:

开发者_C百科

Input:

[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
 Tue, 24 May 2011 16:00:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]

Output (will be the same because they don´t overlap):

[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
 Tue, 24 May 2011 16:00:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]

I was thinking in some recursive approach, but I need some guidance here...


Given a function that returns truthy if two ranges overlap:

def ranges_overlap?(a, b)
  a.include?(b.begin) || b.include?(a.begin)
end

(this function courtesy of sepp2k and steenslag)

and a function that merges two overlapping ranges:

def merge_ranges(a, b)
  [a.begin, b.begin].min..[a.end, b.end].max
end

then this function, given an array of ranges, returns a new array with any overlapping ranges merged:

def merge_overlapping_ranges(overlapping_ranges)
  overlapping_ranges.sort_by(&:begin).inject([]) do |ranges, range|
    if !ranges.empty? && ranges_overlap?(ranges.last, range)
      ranges[0...-1] + [merge_ranges(ranges.last, range)]
    else
      ranges + [range]
    end
  end
end


Searching a little bit I have found a code that does the trick:

def self.merge_ranges(ranges)
  ranges = ranges.sort_by {|r| r.first }
  *outages = ranges.shift
  ranges.each do |r|
    lastr = outages[-1]
    if lastr.last >= r.first - 1
      outages[-1] = lastr.first..[r.last, lastr.last].max
    else
      outages.push(r)
    end
  end
  outages
end

A sample (working with time ranges too!):

ranges = [1..5, 20..20, 4..11, 40..45, 39..50]
merge_ranges(ranges)
=> [1..11, 20..20, 39..50]

Found here: http://www.ruby-forum.com/topic/162010


You can do it by using multi_range gem.

Example 1:

ranges = [
  Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 13:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 16:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 09:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 15:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
]

MultiRange.new(ranges).merge_overlaps.ranges
# => [2011-05-24 08:00:00 +0800..2011-05-24 13:00:00 +0800, 2011-05-24 15:30:00 +0800..2011-05-24 18:00:00 +0800] 

Example 2:

ranges = [
  Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00')..Time.parse('Tue, 24 May 2011 13:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 16:00:00 CEST +02:00')..Time.parse('Tue, 24 May 2011 18:00:00 CEST +02:00'),
]

MultiRange.new(ranges).merge_overlaps.ranges
# => [2011-05-24 08:00:00 +0800..2011-05-24 13:00:00 +0800, 2011-05-24 16:00:00 +0800..2011-05-24 18:00:00 +0800] 


The facets gem has Range.combine method that may be of use: http://rdoc.info/github/rubyworks/facets/master/Range#combine-instance_method


Some kind of algorithm that might help:

Sort range array by start time (r1, r2, r3, r4, .. rn)

for each range pair [r1, r2], [r2, r3] .. [rn-1, rn]:
    if r1_end > r2_start: # they overlap
        add [r1_start, r2_end] to new range array
    else: # they do not overlap
        add [r1] and [r2] to new range array (no changes)

startover with the new range array until no more changes


The solution, offered by @wayne-conrad is a very good one. I implemented it for a problem, I stumbled upon. Then I implemented an iterative version and benchmarked the two. It appears, the iterative version is quicker. Note: I use ActiveSupport for Range#overlaps? and the time helpers, but it is trivial to implement a pure-Ruby version.

require 'active_support/all'

module RangesUnifier
  extend self

  # ranges is an array of ranges, e.g. [1..5, 2..6] 
  def iterative_call(ranges)
    ranges.sort_by(&:begin).reduce([ranges.first]) do |merged_ranges, range|
      if merged_ranges.last.overlaps?(range)
        merged_ranges[0...-1] << merge_ranges(merged_ranges.last, range)
      else
        merged_ranges << range
      end
    end
  end

  def recursive_call(ranges)
    return ranges if ranges.size == 1

    if ranges[0].overlaps?(ranges[1])
      recursive_call [merge_ranges(ranges[0], ranges[1]), *ranges[2..-1]]
    else
      [ranges[0], *recursive_call(ranges[1..-1])]
    end
  end

  def merge_ranges(a, b)
    [a.begin, b.begin].min..[a.end, b.end].max
  end
end

five_hours_ago = 5.hours.ago
four_hours_ago = 4.hours.ago
three_hours_ago = 3.hours.ago
two_hours_ago = 2.hours.ago
one_hour_ago = 1.hour.ago
one_hour_from_now = 1.hour.from_now
two_hours_from_now = 2.hours.from_now
three_hours_from_now = 3.hours.from_now
four_hours_from_now = 4.hours.from_now
five_hours_from_now = 5.hours.from_now

input = [
  five_hours_ago..four_hours_ago,
  three_hours_ago..two_hours_from_now,
  one_hour_ago..one_hour_from_now,
  one_hour_from_now..three_hours_from_now,
  four_hours_from_now..five_hours_from_now
]

RangesUnifier.iterative_call(input) 
#=> [
# 2017-08-21 12:50:50 +0300..2017-08-21 13:50:50 +0300, 
# 2017-08-21 14:50:50 +0300..2017-08-21 20:50:50 +0300, 
# 2017-08-21 21:50:50 +0300..2017-08-21 22:50:50 +0300
# ]

RangesUnifier.recursive_call(input)
#=> [
# 2017-08-21 12:50:50 +0300..2017-08-21 13:50:50 +0300, 
# 2017-08-21 14:50:50 +0300..2017-08-21 20:50:50 +0300, 
# 2017-08-21 21:50:50 +0300..2017-08-21 22:50:50 +0300
# ]

n = 100_000    

Benchmark.bm do |x|
  x.report('iterative') { n.times { RangesUnifier.iterative_call(input) } }
  x.report('recursive') { n.times { RangesUnifier.recursive_call(input) } }
end

# =>
#        user     system      total        real
# iterative  0.970000   0.000000   0.970000 (  0.979549)
# recursive  0.540000   0.010000   0.550000 (  0.546755)


The gem range_operators does a wonderful job by adding missing features to the Ruby Range class. It is way smaller than adding the whole facets gem.

I your case the solution would be the rangify method, which is added to the Array class and would do exactly what you are looking for.


I made a minor update to the answer from Wayne Conrad to handle edge cases involved with open-ended arrays (created with ... operator instead of .. operator).

I changed the name to merge_continuous_ranges since while ranges like 0...1 and 1..2 do not overlap, their combined ranges are continuous, so it makes sense to combine them:

def merge_continuous_ranges(ranges)
  ranges.sort_by(&:begin).inject([]) do |result, range|
    if !result.empty? && ranges_continuous?(result.last, range)
      result[0...-1] + [merge_ranges(result.last, range)]
    else
      result + [range]
    end
  end
end

def ranges_continuous?(a, b)
  a.include?(b.begin) || b.include?(a.begin) || a.end == b.begin || b.end == a.begin
end

def merge_ranges(a, b)
  range_begin = [a.begin, b.begin].min
  range_end = [a.end, b.end].max

  exclude_end = case a.end <=> b.end
  when -1
    b.exclude_end?
  when 0
    a.exclude_end? && b.exclude_end?
  when 1
    a.exclude_end?
  end

  exclude_end ? range_begin...range_end : range_begin..range_end
end


A solution in one method and bug-free for what I can tell:

def merge_ranges(ranges)
  ranges = ranges.sort_by(&:first)
  merged = [ranges[0]]

  ranges.each do |current|
    previous = merged[-1]
    if current.first <= previous.last
      merged[-1] = previous.first..[previous.last, current.last].max
    else
      merged.push(current)
    end
  end

  merged
end

Usage:

ranges = [
  Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 13:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 16:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 09:00:00 CEST +02:00'),
  Time.parse('Tue, 24 May 2011 15:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
]
merge_ranges(ranges)

#=> [2011-05-24 08:00:00 +0200..2011-05-24 13:00:00 +0200, 2011-05-24 15:30:00 +0200..2011-05-24 18:00:00 +0200]

Disclaimer: it's a port of https://stackoverflow.com/a/43600953/807442


Dont you just want to find the smallest first value and the largest last value from the set of arrays?

ranges = [Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
 Tue, 24 May 2011 16:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00,
 Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 09:00:00 CEST +02:00,
 Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]

union = [ranges.collect(&:first).sort.first, ranges.collect(&:last).sort.last]


The Marked answer works well except for few use cases. One of such use case is

[Tue, 21 June 13:30:00 GMT +0:00..Tue, 21 June 15:30:00 GMT +00:00,
Tue, 21 June 14:30:00 GMT +0:00..Tue, 21 June 15:30:00 GMT +00:00]

The condition in ranges_overlap does not handle this use case. So I wrote this

def ranges_overlap?(a, b)
    a.include?(b.begin) || b.include?(a.begin) || a.include?(b.end) || b.include?(a.end)|| (a.begin < b.begin && a.end >= b.end) || (a.begin >= b.begin && a.end < b.end)
end

This is handling all the edge cases for me so far.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号