开发者

How to combine two list that contain intervals

开发者 https://www.devze.com 2023-03-02 14:01 出处:网络
I have list 1 and 2 and I would like to get list 3. If anyone can suggest python or awk script, that would be great.

I have list 1 and 2 and I would like to get list 3. If anyone can suggest python or awk script, that would be great.

List 1
A   100-160
B   200-500
C   800-1500
D   1600-2000
E   2500-3000

List 2
开发者_JAVA百科150
600
900
1700
2400


List 3
A   100-160        150
B   200-500 
C   800-1500    900
D   1600-2000   1700
E   2500-3000   


Here is an example. It is expecting two filenames to be passed on the command line.

import sys

if len(sys.argv) != 3:
    print 'parameters: list1 list2'
    sys.exit(1)

list1 = []
for line in file(sys.argv[1]):
    fields = line.split()
    f1 = fields[0]
    f2, f3 = fields[1].split('-')
    list1.append((f1, int(f2), int(f3), [], ))

for line in file(sys.argv[2]):
    value = int(line)
    for name, lb, ub, values in list1:
        if value >= lb and value <= ub:
            values.append(str(value))

for name, lb, ub, values in list1:
    if values: vals = ','.join(values)
    else: vals = ''
    print '%s %d-%d %s' % (name, lb, ub, vals, )


You could make a dictionary from list 1 that contains the intervals then loops through it to see if any value in list 2 is inside the range. For example,

list1 = {"A": [100, 160], "B": [200, 500], "C": [800, 1500],
         "D": [1600, 2000], "E": [2500,3000]}

list2 = [150, 600, 900, 1700, 2400]

for key, val in list1.iteritems():
    for num in list2:
        if num in range(val[0], val[1]):
            val.append(num)

for key, val in sorted(list1.iteritems()):
    print key, ":", val


You could do something like this in Python:

L1 = [(100, 160), (200, 500), (800, 1500), (1600, 2000), (2500, 3000)]
L2 = [150, 600, 900, 1700, 2400]
L3 = [((a, b), [i for i in L2 if a<=i<b]) for (a, b) in L1]

It's easy to parse the data into that structure if that's what you want (and print it back out), but I'll wait until you explain what format the data comes in and what format you need it in, because I have a feeling there will be a catch to it.


Do it simply on the command line.So far this is the only awk solution among all:

 paste list1 list2|awk '{split($2,a,"-");
                           if($3>a[1] && $3<a[2])
                           {h=$3}
                           else
                           {h=""};
                           print $1,$2,h}'

A 100-160 150
B 200-500
C 800-1500 900
D 1600-2000 1700
E 2500-3000


Dunno for sure if Ruby is needed, but here's a quick pass at it:

list1 = %w[ A 100-160 B 200-500 C 800-1500 D 1600-2000 E 2500-3000 ]
list2 = %w[ 150, 600, 900, 1700, 2400 ]

list3 = []

list1.each_slice(2) do |char, range|
  min_range, max_range = range.split('-').map{ |i| i.to_i }
  l2 = list2.shift
  case l2.to_i
  when min_range..max_range
    list3 << [ char, range, l2 ]
  else
    list3 << [ char, range ]
  end
end

require 'pp'
pp list3

>> [["A", "100-160", "150,"],
>>  ["B", "200-500"],
>>  ["C", "800-1500", "900,"],
>>  ["D", "1600-2000", "1700,"],
>>  ["E", "2500-3000"]]


You can use iterable tools to write this. Not all too readable but this works. Definitely was fun to write!

from itertools import chain

nameranges = ['A', '100-160', 'B', '200-500', 'C', '800-1500', 'D', '1600-2000', 
              'E', '2500-3000']

values = [150, 600, 900, 1700, 2400]
z = zip(nameranges[0::2], ( map(int, x.split("-")) for x in nameranges[1::2]))
f = list(chain(* (map(lambda x, y: (x[0][0], x[1][0], x[1][1], y) 
               if x[1][0]<=y<=x[1][1] else (x[0], x[1][0], x[1][1]), 
               z, values))))
# print f
#['A', 100, 160, 150, 'B', 200, 500, 'C', 800, 1500, 900, 'D', 1600, 2000, 1700, 
# 'E', 2500, 3000]


Here's my take. I just made up the parsing and printing bit since you didn't specify.

list_1 = ['A\t100-160', 'B\t200-500', 'C\t800-1500', 'D\t1600-2000', 'E\t2500-3000']
list_2 = [150, 600, 900, 1700, 2400]

for range in list_1:
    # parse the input (this may be different, but you didn't specify)
    lower_bound, upper_bound = range.split('\t')[1].split('-')  # this is a bit fragile
    lower_bound = int(lower_bound)
    upper_bound = int(upper_bound)

    # make sure lower_bound is less than upper_bound
    if upper_bound < lower_bound:
        (lower_bound, upper_bound) = (upper_bound, lower_bound)

    # loop over list 2 and see if any fall into the current range
    items_in_range = [str(number) for number in list_2 if lower_bound <= number < upper_bound]

    # output List 3
    print range + '\t' + ','.join(items_in_range)
0

精彩评论

暂无评论...
验证码 换一张
取 消