Python - compare nested lists and append matches to new list?

Developer https://www.devze.com 2022-12-24 07:04 Source: Internet

I wish to compare two nested lists of unequal length. I am interested only in a match between the first element of each sub list. Should a match exist, I wish to add the match to another list for subsequent transformation into a tab-delimited file. Here is an example of what I am working with:

x = [['1', 'a', 'b'], ['2', 'c', 'd']]

y = [['1', 'z', 'x'], ['4', 'z', 'x']]

match = []

def find_match():
    for i in x:
        for j in y:
            if i[0] == j[0]:
                 match.append(j)
            return match

This returns:

[['1', 'x'], ['1', 'y'], ['1', 'x'], ['1', 'y'], ['1', 'z', 'x']]

Would it be good practise to reprocess the list to remove duplicates or can this be done in a simpler fashion?

Also, is it better to use tuples and/or tuples of tuples for the purposes of comparison?

Any help is greatly appreciated.

Regards, Seafoid.


  • Use sets to obtain collections with no duplicates.

    • You'll have to use tuples instead of lists as the items because set items must be hashable.
  • The code you posted doesn't seem to generate the output you posted. I do not have any idea how you are supposed to generate that output from that input. For example, the output has 'y' and the input does not.

  • I think the design of your function could be much improved. Currently you define x, y, and match at the module level and read and mutate them explicitly. This is not how you want to design functions. As a general rule, a function shouldn't mutate something at the global level. It should be explicitly passed everything it needs and return a result, not implicitly receive information and change something outside itself.

    I would change

    x = some list
    y = some list
    match = []
    def find_match():
        for i in x:
            for j in y:
                if i[0] == j[0]:
                    match.append(j)
        return match  # This is the only line I changed. I think you meant
                      # your return to be over here?
    find_match()
    

    to

    x = some list
    y = some list
    
    def find_match(x, y):
        match = []
        for i in x:
            for j in y:
                if i[0] == j[0]:
                    match.append(j)
        return match
    match = find_match(x, y)
    
  • To take that last change to the next level, I usually replace the pattern

    def f(...):
        return_value = []
        for...
            return_value.append(foo)
        return return_value
    

    with the similar generator

    def f(...):
        for...
            yield foo
    

    which would make the above function

    def find_match(x, y):
        for i in x:
            for j in y:
                if i[0] == j[0]:
                    yield j
    

    Another way to express this generator's effect is with the generator expression (j for i in x for j in y if i[0] == j[0]).
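As a rough sketch of how the generator version might be used, here it is run against the sample data from the question. The names `matches` and `same` are only illustrative and do not come from the original post.

```python
# Sample data taken from the question.
x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]


def find_match(x, y):
    """Yield each sub list of y whose first element matches one in x."""
    for i in x:
        for j in y:
            if i[0] == j[0]:
                yield j


# A generator is lazy, so wrap it in list() to materialise the results.
matches = list(find_match(x, y))
print(matches)  # [['1', 'z', 'x']]

# The equivalent generator expression produces the same items.
same = list(j for i in x for j in y if i[0] == j[0])
print(same == matches)  # True
```

Because nothing is computed until the generator is consumed, the caller can choose whether to build a list, a set, or simply loop over the results once.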


I don't know if I interpret your question correctly, but given your example it seems that you might be using a wrong index:

change

if i[1] == j[1]:

into

if i[0] == j[0]:


You can do this a lot more simply by using sets.

set_x = {i[0] for i in x}
set_y = {i[0] for i in y}
matches = list(set_x & set_y)
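Note that the set intersection gives only the matching first elements (for the sample data, `['1']`), not the sub lists themselves. If the sub lists are what is needed, one possible approach is sketched below. It assumes the goal is to keep each matching sub list of `y` once, in its original order. The names `keys_x` and `seen` are hypothetical, and the duplicate row added to `y` is there only to show the de-duplication.

```python
x = [['1', 'a', 'b'], ['2', 'c', 'd']]
# The last row is a deliberate duplicate, added for illustration only.
y = [['1', 'z', 'x'], ['4', 'z', 'x'], ['1', 'z', 'x']]

# First elements of x, stored in a set for fast membership tests.
keys_x = {row[0] for row in x}

matches = []
seen = set()
for row in y:
    key = tuple(row)  # lists are unhashable, so use a tuple as the set item
    if row[0] in keys_x and key not in seen:
        seen.add(key)
        matches.append(row)

print(matches)  # [['1', 'z', 'x']]

# Tab-delimited lines, ready to be written to a file.
lines = ['\t'.join(row) for row in matches]
print(lines)  # ['1\tz\tx']
```

This does one pass over each list instead of comparing every pair, which may matter if the lists are long.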


if i[1] == j[1]

checks whether the second elements of the sub lists are identical. You want if i[0] == j[0].

Otherwise, I find your code quite readable and wouldn't necessarily change it.


A simpler expression should work here too (in Python 3, filter and map return lazy iterators, so the result is wrapped in list()):

list_of_lists = filter(lambda l: l[0][0] == l[1][0], zip(x, y))
list(map(lambda l: l[1], list_of_lists))
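One caveat worth illustrating: `zip(x, y)` pairs the sub lists by position, so this expression only finds a match when the two sub lists sit at the same index, and it stops at the end of the shorter list. The sketch below contrasts that with the nested-loop comparison. The function names `positional_matches` and `nested_matches` are made up for this example.

```python
x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]


def positional_matches(x, y):
    """Keep y's sub list only when it matches x's sub list at the same index."""
    return [b for a, b in zip(x, y) if a[0] == b[0]]


def nested_matches(x, y):
    """Keep every sub list of y that matches any sub list of x."""
    return [b for a in x for b in y if a[0] == b[0]]


print(positional_matches(x, y))  # [['1', 'z', 'x']]

# Reversing y moves the matching sub list to a different position.
y_reversed = y[::-1]
print(positional_matches(x, y_reversed))  # []
print(nested_matches(x, y_reversed))      # [['1', 'z', 'x']]
```

With the sample data both approaches agree, but only because the matching sub lists happen to share index 0.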