开发者

How can I merge and order multiple lists together using Ruby?

开发者 https://www.devze.com 2023-01-19 07:53 出处:网络
I have 2 lists that have dates and data.Each list is in the proper order as noted by the sequence number.Now I need to merge the 2 lists together and keep everythi开发者_JAVA技巧ng in the correct orde

I have 2 lists that have dates and data. Each list is in the proper order as noted by the sequence number. Now I need to merge the 2 lists together and keep everythi开发者_JAVA技巧ng in the correct order.

For example:

List A

20101001 A data 1 seq1

20101001 A data 2 seq2

20101005 A data 3 seq3

List B

20101001 B data 1 seq1

20101003 B data 2 seq2

etc...

I need the new list to look like this:

20101001 A data 1 seq1

20101001 A data 2 seq2

20101001 B data 1 seq3

20101003 B data 2 seq4

20101005 A data 3 seq5

2 things that I thought of is merging the lists together and applying the sequence number prior to inserting them into a db or I can insert them into the db with the current sequence and pull them back out again to merge them together, but that seems like an extra step and kludgy.

Any ideas on the best way to go about this?


Assuming your lists are in Ruby Arrays, and the objects in the lists have attributes defined (such as obj.sequence_number), one way to merge and sort the lists would be:

First merge the lists as a union:

@merged_list = @list_a | @list_b

Then sort the merged_list with the appropriate sorting rule:

@merged_list.sort! {|a, b| a.date <=> b.date # or whatever your sorting rule is... }

Edit:

Once the merged array is sorted, you can re-define the sequence_number:

@merged_list.each_with_index {|obj, index| obj.sequence_number = "seq#{index+1}"}

Edit:

Same thing applies if your objects in the lists are themselves just simple arrays:

@merged_list.sort! {|a, b| a[0] <=> b[0] # or whatever your sorting rule is... }
@merged_list.each_with_index {|obj, index| obj[2] = "seq#{index+1}"}


Try this:

(listA + listB).sort!{|a, b| a.sequence_no <=> b.sequence_no}


This is an algorithm for merging an arbitrary number of sorted lists in more or less linear time:

def merge_sorted(*lists)
  # the lists will be modified, so make (shallow) copies
  lists = lists.map(&:dup)
  result = []
  loop do
    # ignore lists that have been exhausted
    lists = lists.reject(&:empty?)
    # we're done if all lists have been exhausted
    break if lists.empty?
    # find the list with the smallest first element
    top = lists.inject do |candidate, other|
      candidate.first < other.first ? candidate : other
    end
    result << top.shift
  end
  result
end

list1 = [1, 2, 5, 6, 9]
list2 = [2, 3, 4, 11, 13]
list3 = [1, 2, 2, 2, 3]

p merge_sorted(list1, list2, list3)
  # => [1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 5, 6, 9, 11, 13]

For each iteration it finds the list with the smallest first element, and shifts this element off of it onto the results list. It does this until all lists are empty.

I say more or less linear time since it's actually O(n × m) where n is the number of lists and m is the total number of elements in the lists but I think this can safely be simplified to O(m) for most cases since n will be small in comparison to m.


This uses with_index which is a nice way to add an index value to an iterator:

result = (list_a + list_b).sort_by { |a| a[0 .. -2] }.map.with_index { |a, i| a[0 .. -2] + (1 + i).to_s }
puts result
# >> 20101001 A data 1 seq1
# >> 20101001 A data 2 seq2
# >> 20101001 B data 1 seq3
# >> 20101003 B data 2 seq4
# >> 20101005 A data 3 seq5

Here's some variations with benchmarks:

require 'benchmark'

list_a = [
  '20101001 A data 1 seq1',
  '20101001 A data 2 seq2',
  '20101005 A data 3 seq3'
]

list_b = [
  '20101001 B data 1 seq1',
  '20101003 B data 2 seq2'
]

# #1
result = (list_a + list_b).sort_by { |a| a[0 .. -2] }.map.with_index { |a, i| a[0 .. -2] + (1 + i).to_s }
result # => ["20101001 A data 1 seq1", "20101001 A data 2 seq2", "20101001 B data 1 seq3", "20101003 B data 2 seq4", "20101005 A data 3 seq5"]

# #2
result = (list_a + list_b).map{ |r| r[0 .. -2] }.sort.map.with_index { |a, i| a + (1 + i).to_s }
result # => ["20101001 A data 1 seq1", "20101001 A data 2 seq2", "20101001 B data 1 seq3", "20101003 B data 2 seq4", "20101005 A data 3 seq5"]

# #3
i = 0
result = (list_a + list_b).map{ |r| r[0 .. -2] }.sort.map { |a| i += 1; a + i.to_s }
result # => ["20101001 A data 1 seq1", "20101001 A data 2 seq2", "20101001 B data 1 seq3", "20101003 B data 2 seq4", "20101005 A data 3 seq5"]

# #4
i = 0; result = (list_a + list_b).sort.map { |a| i += 1; a[-1] = i.to_s; a }
result # => ["20101001 A data 1 seq1", "20101001 A data 2 seq2", "20101001 B data 1 seq3", "20101003 B data 2 seq4", "20101005 A data 3 seq5"]

n = 75000
Benchmark.bm(7) do |x|
  x.report('#1') { n.times { (list_a + list_b).sort_by { |a| a[0 .. -2] }.map.with_index { |a, i| a[0 .. -2] + (1 + i).to_s } } } 
  x.report('#2') { n.times { (list_a + list_b).map{ |r| r[0 .. -2] }.sort.map.with_index { |a, i| a + (1 + i).to_s } } }
  x.report('#3') { n.times { i = 0; (list_a + list_b).map{ |r| r[0 .. -2] }.sort.map { |a| i += 1; a + i.to_s } } }
  x.report('#4') { n.times { i = 0; (list_a + list_b).sort.map { |a| i += 1; a[-1] = i.to_s } } }
end
# >>              user     system      total        real
# >> #1       1.150000   0.000000   1.150000 (  1.147090)
# >> #2       0.880000   0.000000   0.880000 (  0.880038)
# >> #3       0.720000   0.000000   0.720000 (  0.727135)
# >> #4       0.580000   0.000000   0.580000 (  0.572688)

It's good to benchmark.

0

精彩评论

暂无评论...
验证码 换一张
取 消