I want to compare a list of lists that have the same length, but differ in their content. My script should return only the positions that share exactly the same element (in all lists). For example:
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
and as a result I get a list of positions p = [3,7], since all lists have '4' and '8' at positions 3 and 7, respectively.
These elements can be strings as well, I'm just giving an example with integers. Thanks for any help!
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
p = [i for i, j in enumerate(zip(*l)) if all(j[0]==k for k in j[1:])]
# p == [3], not [3, 7]: the second list in your example has nine elements instead of eight (probably a typo), so position 7 does not match.
That is the one-liner (list comprehension) form of this more verbose loop:
p = []
for i, j in enumerate(zip(*l)):
    if all(j[0]==k for k in j[1:]):
        p.append(i)
zip(*l) gives you:
[(1, 9, 5, 0),
(2, 8, 6, 0),
(3, 8, 7, 1),
(4, 4, 4, 4),
(5, 3, 9, 7),
(6, 4, 9, 6),
(7, 5, 9, 3),
(8, 7, 8, 8)]
enumerate() attaches the index 0, 1, 2, ... to each tuple in that list.
all(j[0]==k for k in j[1:]) compares the first element of the tuple with all remaining elements and returns True if all of them are equal, False otherwise (it returns False as soon as it finds a different element, so it's faster).
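To see those three pieces working together, here is a small sketch on shortened, made-up lists (the names `rows`, `columns` and `positions` are only for this illustration). Under Python 3, zip() returns an iterator, so it is wrapped in list() to show its contents:

```python
# Hypothetical sample data, shorter than the lists in the question.
rows = [[1, 2, 3, 4],
        [9, 8, 3, 4],
        [5, 6, 3, 0]]

# zip(*rows) regroups the data column by column.
columns = list(zip(*rows))
print(columns)  # [(1, 9, 5), (2, 8, 6), (3, 3, 3), (4, 4, 0)]

# enumerate() pairs each column with its position.
for i, col in enumerate(columns):
    # all() stops at the first element that differs from col[0].
    print(i, col, all(col[0] == k for k in col[1:]))

positions = [i for i, col in enumerate(columns)
             if all(col[0] == k for k in col[1:])]
print(positions)  # [2]
```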
I liked eumiro's solution, but I did it with a set:
p = [i for i, j in enumerate(zip(*l)) if len(set(j)) == 1]
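A quick check of the set variant on strings, as a sketch with made-up data (`words` is not from the question). It relies on the elements being hashable, which holds for integers and strings but not for, say, nested lists:

```python
# Hypothetical sample data using strings instead of integers.
words = [["a", "b", "c", "d"],
         ["x", "b", "y", "d"],
         ["q", "b", "c", "d"]]

# A column whose set has exactly one member holds a single distinct value.
p = [i for i, col in enumerate(zip(*words)) if len(set(col)) == 1]
print(p)  # [1, 3]
```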
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
r = []
for i in range(len(l[0])):
    e = l[0][i]
    same = True
    for j in range(1, len(l)):
        if e != l[j][i]:
            same = False
            break
    if same:
        r.append(i)
print r
This prints only [3], because l[1] does not have 8 at position 7: it has one element more than the other lists.
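Since the surprise comes from one list being longer than the others, it may help to check the lengths before comparing. A minimal sketch, where `common_positions` is a hypothetical helper name and raising ValueError is just one possible way to react:

```python
def common_positions(lists):
    """Return the indices at which every list holds the same element."""
    lengths = set(len(sub) for sub in lists)
    if len(lengths) > 1:
        # zip() would silently truncate to the shortest list, so fail loudly.
        raise ValueError("lists differ in length: %s" % sorted(lengths))
    return [i for i, col in enumerate(zip(*lists))
            if all(col[0] == k for k in col[1:])]


l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
try:
    common_positions(l)
except ValueError as err:
    print(err)  # lists differ in length: [8, 9]
```

With lists of equal length, for example common_positions([[1, 2, 3], [0, 2, 3]]), it returns the matching positions, here [1, 2].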
li = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,6,5,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
first = li[0]
r = range(len(first))
for current in li[1:]:
    r = [ i for i in r if current[i]==first[i]]
print [first[i] for i in r]
Result:
[4, 8]
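The same narrowing-down idea can be wrapped in a function. This is only a sketch: `matching_indices` is a hypothetical name, the early `break` is an addition, and it assumes every list is at least as long as the first one. It returns the positions, so both the positions asked for in the question and the elements printed above can be derived from it:

```python
def matching_indices(lists):
    """Narrow the candidate indices one list at a time."""
    first = lists[0]
    candidates = list(range(len(first)))
    for current in lists[1:]:
        # Keep only the indices that still match in this list.
        candidates = [i for i in candidates if current[i] == first[i]]
        if not candidates:
            break  # nothing left to compare
    return candidates


li = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,6,5,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
idx = matching_indices(li)
print(idx)                      # [3, 7]
print([li[0][i] for i in idx])  # [4, 8]
```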
Comparing execution times:
from time import clock
li = [[1,2,3,4,5,6,7,8,9,10],
      [9,8,8,4,5,6,5,8,9,13],
      [5,6,7,4,9,9,9,8,9,12],
      [0,0,1,4,7,6,3,8,9,5]]
n = 10000
te = clock()
for turn in xrange(n):
    first = li[0]
    r = range(len(first))
    for current in li[1:]:
        r = [ i for i in r if current[i]==first[i]]
    x = [first[i] for i in r]
t1 = clock()-te
print 't1 =',t1
print x
te = clock()
for turn in xrange(n):
    y = [j[0] for i, j in enumerate(zip(*li)) if all(j[0]==k for k in j[1:])]
t2 = clock()-te
print 't2 =',t2
print y
print 't2/t1 =',t2/t1
print
Result:
t1 = 0.176347273187
[4, 8, 9]
t2 = 0.579408755442
[4, 8, 9]
t2/t1 = 3.28561221827
With longer lists:
li = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,2,22,26,24,25],
[9,8,8,4,5,6,5,8,9,13,18,12,15,14,15,15,4,16,19,20,2,158,35,24,13],
[5,6,7,4,9,9,9,8,9,12,45,12,4,19,15,20,24,18,19,20,2,58,23,24,25],
[0,0,1,4,7,6,3,8,9,5,12,12,12,15,15,15,5,3,14,20,9,18,28,24,14]]
Result:
t1 = 0.343173188632
[4, 8, 9, 12, 15, 20, 24]
t2 = 1.21259110432
[4, 8, 9, 12, 15, 20, 24]
t2/t1 = 3.53346690385
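The figures above come from Python 2 and time.clock(), which was removed in Python 3.8. To repeat the comparison on a current interpreter, something along these lines with the standard timeit module should work. The function names `by_filtering` and `by_zip` are made up here, and the timings (and possibly the ratio) will differ from those quoted:

```python
from timeit import timeit

li = [[1,2,3,4,5,6,7,8,9,10],
      [9,8,8,4,5,6,5,8,9,13],
      [5,6,7,4,9,9,9,8,9,12],
      [0,0,1,4,7,6,3,8,9,5]]

def by_filtering(lists):
    # Narrow the list of matching indices one list at a time.
    first = lists[0]
    r = range(len(first))
    for current in lists[1:]:
        r = [i for i in r if current[i] == first[i]]
    return [first[i] for i in r]

def by_zip(lists):
    # Compare each column produced by zip() against its first element.
    return [j[0] for i, j in enumerate(zip(*lists))
            if all(j[0] == k for k in j[1:])]

n = 10000
t1 = timeit(lambda: by_filtering(li), number=n)
t2 = timeit(lambda: by_zip(li), number=n)
print('t1 =', t1)
print('t2 =', t2)
print('t2/t1 =', t2 / t1)
```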