
Merging Two Outer Lists Based on Iterated Inner List Value

Developer https://www.devze.com 2023-04-06 04:03 Source: Internet

I have two lists of lists - i.e.

[['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]

as well as

[['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith']]

Let's call them list1 and list2.

I want to merge list1 and list2, but ONLY when the following holds (note: this is pseudocode, I'm trying to figure out how to program this in Python):

x[0] in list1 == x[0] in list2

I'm not sure how to write this out.

By merge I mean (pseudocode)

list[x] = list1[x] + list2[x] while x[0] in list1 == x[0] in list2

Output desired:

[['1', 'expired', 'test', '0', '1', 'Andrew', 'Alexander'], ['31', 'active', 'test', '1', '31', 'John', 'Smith']]

The only critical point is that not all of the x[0]'s are going to match up perfectly.


Using agf's idea of employing a collections.defaultdict, this runs in O(m+n), where m and n are the lengths of the lists.

import collections
import itertools

x=[['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]
y=[['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith']]

result=collections.defaultdict(list)
for item in itertools.chain(x,y):
    result[item[0]].append(item)
result=[list(itertools.chain.from_iterable(value)) for value in result.values()]
print(result)

yields

[['1', 'expired', 'test', '0', '1', 'Andrew', 'Alexander'], ['31', 'active', 'test', '1', '31', 'John', 'Smith']]

In the comments the OP says the desired output is

[['1', 'expired', 'test', '0', 'Andrew', 'Alexander'], ['31', 'active', 'test', '1', 'John', 'Smith']]

(This is different from the desired output posted in the original question.)

Then:

import collections
import itertools

x=[['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]
y=[['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith']]

result={}
for item in itertools.chain(x,y):
    result.setdefault(item[0],item[:1]).extend(item[1:])
result=list(result.values())  # list() so Python 3 prints a plain list, not dict_values
print(result)

This is one of the few times I've found using setdefault more convenient than collections.defaultdict.
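Since the question says not every x[0] will have a partner, here is a minimal sketch of the same setdefault idea restricted to ids that occur in both lists. The helper name merge_matching and the extra unmatched rows ('7' and '51') are assumptions added for illustration.

```python
import itertools

def merge_matching(x, y):
    """Merge rows of x and y that share a first element (hypothetical helper)."""
    # Only ids present in both lists should survive.
    common = {row[0] for row in x} & {row[0] for row in y}
    result = {}
    for item in itertools.chain(x, y):
        if item[0] in common:
            # item[:1] is a fresh list holding just the id; later rows extend it.
            result.setdefault(item[0], item[:1]).extend(item[1:])
    return list(result.values())

x = [['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1'], ['7', 'active', 'test', '0']]
y = [['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith'], ['51', 'Johnny', 'Nomatch']]
print(merge_matching(x, y))
```

The rows for '7' and '51' are dropped because each id appears in only one of the lists.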


If you want [[1, 'a'], [2, 'b']] and [[1, 'c'], [3, 'd']] merged to [[1, 'a', 'c'], [2, 'b'], [3, 'd']]:

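from collections import defaultdict

list1 = [[1, 'a'], [2, 'b']]
list2 = [[1, 'c'], [3, 'd']]
dict1_2 = defaultdict(list)
dict1_2.update((item[0], item[1:]) for item in list1)
for item in list2:
    # extend (not append) so the values stay in one flat list
    dict1_2[item[0]].extend(item[1:])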

If you want them merged to [[1, 'a', 'c']] (only the keys present in both lists):

dict1 = dict((item[0], item[1:]) for item in list1)
dict1_2 = {}
for item in list2:
    key = item[0]
    if key in dict1:
        dict1_2[key] = dict1[key] + item[1:]

You're using the item[0] as keys, so you should use a datatype that fits that. In this case, that's a dictionary / mapping.

This works (on average) in linear time, O(m+n), where m and n are the lengths of the lists. Any solution using nested loops or similar will be O(m*n).
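To make the complexity point concrete, here is a minimal sketch comparing the two approaches on the small example above. The function names join_nested and join_indexed are hypothetical, and the sketch assumes each key appears at most once in list2 (otherwise the dictionary keeps only the last row for that key, while the nested version returns every pairing).

```python
def join_nested(list1, list2):
    # Compares every row of list1 with every row of list2: O(m*n).
    return [x + y[1:] for x in list1 for y in list2 if x[0] == y[0]]

def join_indexed(list1, list2):
    # Build the lookup once (O(n)), then make one pass over list1 (O(m)).
    index = {row[0]: row[1:] for row in list2}
    return [row + index[row[0]] for row in list1 if row[0] in index]

list1 = [[1, 'a'], [2, 'b']]
list2 = [[1, 'c'], [3, 'd']]
print(join_nested(list1, list2))   # [[1, 'a', 'c']]
print(join_indexed(list1, list2))  # [[1, 'a', 'c']]
```

Both return the same rows here; the difference only shows up in running time as the lists grow.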

If you really need the data back as a list, you can do

list1_2 = [[key] + value for key, value in dict1_2.items()]  # use iteritems() on Python 2


resultlist = []
for x in list1:
    for y in list2:
        if x[0] == y[0]:
            resultlist.append(x+y)


Not the best way, but definitely concise and hard to read, if that's what you're after:

>>> l1 = [['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]
>>> l2 = [['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith']]

>>> [sl1 + list(*[sl2[1:] for sl2 in l2 if sl2[0]==sl1[0]]) for sl1 in l1]

[['1', 'expired', 'test', '0', 'Andrew', 'Alexander'], ['31', 'active', 'test', '1', 'John', 'Smith']]

Please don't actually use this in any real code.


l1 = [['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]
l2 = [['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith'], ['51', 'Johnny', 'Nomatch']]

from itertools import groupby, chain
from operator import itemgetter

rows = sorted(chain(l1,l2), key=itemgetter(0)) # puts the related lists together
groups = groupby(rows, itemgetter(0)) # groups them by first element
chains = (group for key, group in groups) # get each group
print([list(chain.from_iterable(g)) for g in chains]) # merge them

It's a oneliner ;-)

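Items that don't match are included. You can filter them out by checking the length of each merged list (with this data, keep those longer than 4 elements); note that a groupby group is an iterator and has no len() of its own.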
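One way to do that filtering, as a rough sketch: materialize each group and keep only the ids that produced more than one row. This assumes ids are unique within each list, so a group of two or more rows means the id was found in both; the names rows, members, and merged are illustrative.

```python
from itertools import chain, groupby
from operator import itemgetter

l1 = [['1', 'expired', 'test', '0'], ['31', 'active', 'test', '1']]
l2 = [['1', 'Andrew', 'Alexander'], ['31', 'John', 'Smith'], ['51', 'Johnny', 'Nomatch']]

rows = sorted(chain(l1, l2), key=itemgetter(0))  # stable sort keeps l1 rows before l2 rows
merged = []
for key, group in groupby(rows, key=itemgetter(0)):
    members = list(group)  # a group is an iterator, so turn it into a list to count it
    if len(members) > 1:   # the id occurred in both lists (assuming unique ids per list)
        merged.append(list(chain.from_iterable(members)))
print(merged)
```

Counting rows per group avoids depending on how many fields each inner list happens to have.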
