开发者

Merge N sorted arrays in ruby lazily

开发者 https://www.devze.com 2023-02-28 10:59 出处:网络
How does one merge N sorted arrays (or other l开发者_如何转开发ist-like data structures) lazily in Ruby? For example, in Python you would use heapq.merge. There must be something like this built into

How does one merge N sorted arrays (or other l开发者_如何转开发ist-like data structures) lazily in Ruby? For example, in Python you would use heapq.merge. There must be something like this built into Ruby, right?


Here's a (slightly golfed) solution that should work on arrays of any 'list-like' collections that support #first, #shift, and #empty?. Note that it is destructive - each call to lazymerge removes one item from one collection.

def minheap a,i
  r=(l=2*(m=i)+1)+1 #get l,r index
  m = l if l< a.size and a[l].first < a[m].first
  m = r if r< a.size and a[r].first < a[m].first
  (a[i],a[m]=a[m],a[i];minheap(a,m)) if (m!=i)
end


def lazymerge a
  (a.size/2).downto(1){|i|minheap(a,i)}
  r = a[0].shift
  a[0]=a.pop if a[0].empty?
  return r
end

p arrs = [ [1,2,3], [2,4,5], [4,5,6],[3,4,5]]
v=true
puts "Extracted #{v=lazymerge (arrs)}.  Arr= #{arrs.inspect}" while v

Output:

[[1, 2, 3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 1.  Arr= [[2, 3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 2.  Arr= [[3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 2.  Arr= [[4, 5], [3], [4, 5, 6], [3, 4, 5]]
Extracted 3.  Arr= [[4, 5], [3, 4, 5], [4, 5, 6]]
Extracted 3.  Arr= [[4, 5], [4, 5], [4, 5, 6]]
Extracted 4.  Arr= [[5], [4, 5], [4, 5, 6]]
Extracted 4.  Arr= [[5], [5], [4, 5, 6]]
Extracted 4.  Arr= [[5, 6], [5], [5]]
Extracted 5.  Arr= [[6], [5], [5]]
Extracted 5.  Arr= [[5], [6]]
Extracted 5.  Arr= [[6]]
Extracted 6.  Arr= [[]]
Extracted .  Arr= [[]]

Note also that this algorithm is also lazy about maintaining the heap property - it is not maintained between calls. This probably causes it to do more work than needed, since it does a complete heapify on each subsequent call. This could be fixed by doing a complete heapify once up front, then calling minheap(a,0) before the return r line.


I ended up writing it myself using the data structures from the 'algorithm' gem. It wasn't as bad as I expected.

require 'algorithms'

class LazyHeapMerger
  def initialize(sorted_arrays)
    @heap = Containers::Heap.new { |x, y| (x.first <=> y.first) == -1 }
    sorted_arrays.each do |a|
      q = Containers::Queue.new(a)
      @heap.push([q.pop, q])
    end
  end

  def each
    while @heap.length > 0
      value, q = @heap.pop
      @heap.push([q.pop, q]) if q.size > 0
      yield value
    end
  end
end

m = LazyHeapMerger.new([[1, 2], [3, 5], [4]])
m.each do |o|
  puts o
end


Here's an implementation which should work on any Enumerable, even infinite ones. It returns Enumerator.

def lazy_merge *list
  list.map!(&:enum_for) # get an enumerator for each collection
  Enumerator.new do |yielder|
    hash = list.each_with_object({}){ |enum, hash|
      begin
        hash[enum] = enum.next
      rescue StopIteration
        # skip empty enumerators
      end
    }
    loop do
      raise StopIteration if hash.empty?

      enum, value = hash.min_by{|k,v| v}
      yielder.yield value
      begin
        hash[enum] = enum.next
      rescue StopIteration
        hash.delete(enum) # remove enumerator that we already processed
      end
    end
  end
end

Infinity = 1.0/0 # easy way to get infinite range

p lazy_merge([1, 3, 5, 8], (2..4), (6..Infinity), []).take(12)
#=> [1, 2, 3, 3, 4, 5, 6, 7, 8, 8, 9, 10]


No, there's nothing built in to do that. At least, nothing that springs instantly to mind. However, there was a GSoC project to implement the relevant data types a couple of years ago, which you could use.

0

精彩评论

暂无评论...
验证码 换一张
取 消