开发者

List of minimal pairs from a pair of lists

开发者 https://www.devze.com 2023-01-27 05:29 出处:网络
Given two lists of integers, generate the shortest list of pairs where every value in both lists is present. The first of each pair must be a value from the first list, and the second of each pair mus

Given two lists of integers, generate the shortest list of pairs where every value in both lists is present. The first of each pair must be a value from the first list, and the second of each pair must be a value from the second list. The first of each pair must be less than the se开发者_如何学Gocond of the pair.

A simple zip will not work if the lists are different lengths, or if the same integer exists at the same position in each list.

def gen_min_pairs(uplist, downlist):
    for pair in zip(uplist, downlist):
        yield pair

Here is what I can come up with so far:

def gen_min_pairs(uplist, downlist):
    up_gen = iter(uplist)
    down_gen = iter(downlist)

    last_up = None
    last_down = None

    while True:
        next_out = next(up_gen, last_up)
        next_down = next(down_gen, last_down)

        if (next_up == last_up and
            next_down == last_down):
            return

        while not next_up < next_down:
            next_down = next(down_gen, None)
            if next_down is None:
                return
        yield next_up, next_down

        last_up = next_up
        last_down = next_down

And here is a simple test routine:

if __name__ == '__main__':
    from pprint import pprint

    datalist = [
        {
            'up': [1,7,8],
            'down': [6,7,13]
        },
        {
            'up': [1,13,15,16],
            'down': [6,7,15]
        }
    ]

    for dates in datalist:    
        min_pairs = [pair for pair in
                     gen_min_pairs(dates['up'], dates['down'])]
        pprint(min_pairs)

The program produces the expect output for the first set of dates, but fails for the second.

Expected:

[(1, 6), (7, 13), (8, 13)]
[(1, 6), (1, 7), (13, 15)]

Actual:

[(1, 6), (7, 13), (8, 13)]
[(1, 6), (13, 15)]

I think this can be done while only looking at each element of each list once, so in the complexity O(len(up) + len(down)). I think it depends on the number elements unique to each list.

EDIT: I should add that we can expect these lists to be sorted with the smallest integer first.

EDIT: uplist and downlist were just arbitrary names. Less confusing arbitrary ones might be A and B.

Also, here is a more robust test routine:

from random import uniform, sample
from pprint import pprint

def random_sorted_sample(maxsize=6, pop=31):
    size = int(round(uniform(1,maxsize)))
    li = sample(xrange(1,pop), size)
    return sorted(li)

if __name__ == '__main__':
    A = random_sorted_sample()
    B = random_sorted_sample()

    min_pairs = list(gen_min_pairs(A, B))

    pprint(A)
    pprint(B)
    pprint(min_pairs)

This generates random realistic inputs, calculates the output, and displays all three lists. Here is an example of what a correct implementation would produce:

[11, 13]
[1, 13, 28]
[(11, 13), (13, 28)]

[5, 15, 24, 25]
[3, 13, 21, 22]
[(5, 13), (15, 21), (15, 22)]

[3, 28]
[4, 6, 15, 16, 30]
[(3, 4), (3, 6), (3, 15), (3, 16), (28, 30)]

[2, 5, 20, 24, 26]
[8, 12, 16, 21, 23, 28]
[(2, 8), (5, 12), (5, 16), (20, 21), (20, 23), (24, 28), (26, 28)]

[3, 4, 5, 6, 7]
[1, 2]
[]


I had many ideas to solve this (see edit history ;-/) but none of them quite worked out or did it in linear time. It took me a while to see it, but I had a similar problem before so I really wanted to figure this out ;-)

Anyways, in the end the solution came when I gave up on doing it directly and started drawing graphs about the matchings. I think your first list simply defines intervals and you're looking for the items that fall into them:

def intervals(seq):
    seq = iter(seq)
    current = next(seq)
    for s in seq:
        yield current,s
        current = s
    yield s, float("inf")

def gen_min_pairs( fst, snd):
    snd = iter(snd)
    s = next(snd)
    for low, up in intervals(fst):
        while True:
            # does it fall in the current interval
            if low < s <= up:
                yield low, s
                # try with the next
                s = next(snd)
            else:
                # nothing in this interval, go to the next
                break


zip_longest is called izip_longest in python 2.x.

import itertools    

def MinPairs(up,down):
    if not (up or down):
        return []
    up=list(itertools.takewhile(lambda x:x<down[-1],up))
    if not up:
        return []
    down=list(itertools.dropwhile(lambda x:x<up[0],down))
    if not down:
        return []
    for i in range(min(len(up),len(down))):
        if up[i]>=down[i]:
            up.insert(i,up[i-1])
    return tuple(itertools.zip_longest(up,down,fillvalue=(up,down)[len(up)>len(down)][-1]))


While not a complete answers (i.e. no code), have you tried looking at the numpy "where" module?

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号