I have a list whose entries have the form [start position, stop position, [sample names with those positions]].
My goal is to remove the duplicates that have exactly the same start and stop positions, and just add the extra samples to the sample names section of the entry that is kept. The problem I'm encountering is that when I delete from the list, I end up with an out-of-range error, because len(list) is not recalculated within the loops.
for g in range(len(list)):
    for n in range(len(list)):
        # compares the start and stop position of one line to the start and stop of another line
        if list[g][0] == list[n+1][0] and list[g][1] == list[n+1][1]:
            # adds new sample numbers to first start and stop entry with duplication
            labels1 = list[g][2]
            labels2 = list[n+1][2]
            labels = labels1 + labels2
            list[g][2] = labels
            # now delete the extra line
            del list[n+1]
I'm not sure I understand what you want, but it might be this:
from collections import defaultdict
d = defaultdict(list)
for start, stop, samples in L1:
    d[start, stop].extend(samples)
L2 = [[start, stop, samples] for (start, stop), samples in d.items()]
Which will take L1:
L1 = [ [1, 5, ["a", "b", "c"]], [3, 4, ["d", "e"]], [1, 5, ["f"]] ]
and make L2:
L2 = [ [1, 5, ["a", "b", "c", "f"]], [3, 4, ["d", "e"]] ]
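Please note that on Python versions before 3.7, where dicts are unordered, this does not guarantee the same order of the elements in L2 as in L1 (from 3.7 on, dicts keep insertion order), but from the looks of your question, that doesn't matter.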
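Your loops should not be for loops; they should be while loops with an explicit increment step, because range(len(list)) is evaluated only once, before the loop starts, so it never sees the list shrinking. I guess you could just manually check the condition within your for loop (and continue if it's not met), but a while loop makes more sense, imo.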
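For what it's worth, here is a rough sketch of what that while-loop version could look like. The function name merge_duplicates and the variable name rows are made up for illustration (I avoided calling the variable list, since that shadows the built-in). The inner loop only advances n when nothing was deleted, and both loops re-check len(rows) on every pass.

```python
def merge_duplicates(rows):
    """Merge rows that share (start, stop), modifying rows in place.

    Hypothetical helper, not from the question.
    """
    g = 0
    while g < len(rows):          # len(rows) is re-evaluated on every pass
        n = g + 1                 # only compare against later rows
        while n < len(rows):
            if rows[g][0] == rows[n][0] and rows[g][1] == rows[n][1]:
                # append the duplicate's samples to the first occurrence
                rows[g][2] = rows[g][2] + rows[n][2]
                # delete the duplicate; do not advance n, because the
                # next row has just shifted into position n
                del rows[n]
            else:
                n += 1
        g += 1
    return rows


rows = [[1, 5, ["a", "b", "c"]], [3, 4, ["d", "e"]], [1, 5, ["f"]]]
merge_duplicates(rows)
print(rows)
```

This is still quadratic in the number of rows, so for long lists a dict keyed on (start, stop) is likely the better choice.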
Here is truppo's answer, re-written to preserve the order of entries from L1. It has a few other small changes, such as using a plain dict instead of a defaultdict, and using explicit tuples instead of packing and unpacking them on the fly.
L1 = [ [1, 5, ["a", "b", "c"]], [3, 4, ["d", "e"]], [1, 5, ["f"]] ]
d = {}
oplist = [] # order-preserving list
for start, stop, samples in L1:
    tup = (start, stop) # make a tuple out of start/stop pair
    if tup in d:
        d[tup].extend(samples)
    else:
        d[tup] = list(samples) # copy, so the inner lists of L1 are not modified
        oplist.append(tup)
L2 = [[tup[0], tup[1], d[tup]] for tup in oplist]
print(L2)
# prints: [[1, 5, ['a', 'b', 'c', 'f']], [3, 4, ['d', 'e']]]
I've just put together a nice little list comprehension that does pretty much what you did, except without the nasty dels.
from functools import reduce
from operator import add
from itertools import groupby
data = [
    [1, 1, [2, 3, 4]],
    [1, 1, [5, 7, 8]],
    [1, 3, [2, 8, 5]],
    [2, 3, [1, 7, 9]],
    [2, 3, [3, 8, 5]],
]
data.sort()
print(
    [[key[0], key[1], reduce(add, (i[2] for i in iterator))]
     for key, iterator in groupby(data, lambda item: item[:2])]
)
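One thing worth pointing out about the groupby approach: itertools.groupby only merges items that sit next to each other, so the sort step matters. Here is a small sketch showing the difference, using the sample data from the earlier answer; the helper name merge is made up for illustration, and it flattens the sample lists with a nested comprehension instead of reduce.

```python
from itertools import groupby


def merge(rows):
    """Merge adjacent rows sharing (start, stop); hypothetical helper."""
    return [
        [key[0], key[1], [s for row in group for s in row[2]]]
        for key, group in groupby(rows, key=lambda row: row[:2])
    ]


data = [[1, 5, ["a"]], [3, 4, ["d"]], [1, 5, ["f"]]]

# Without sorting, the two [1, 5, ...] rows are not adjacent and stay separate.
unsorted_result = merge(data)

# After sorting, rows with equal (start, stop) are adjacent and get merged.
sorted_result = merge(sorted(data))

print(unsorted_result)
print(sorted_result)
```

The trade-off is that sorting changes the order of the entries, so if the original order matters, the dict-based answers are probably a better fit.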