In the code I'm writing, I need to intersect two lists like these:
listA:
chr1 aatt
chr8 tagg
chr11 aaaa
chr7 gtag
listB:
chr8 tagt
chr1 tttt
chr7 gtag
chr11 aaaa
chr9 atat
#These lists are made up of one str per line, which has a "\t" (tab) in the middle.
#Also note that they are in a different order.
How can I get the intersection of these two lists?
desired result:
chr7 gtag
chr11 aaaa
I can also generate lists of two strings per line, like this:
listA:
('chr1', 'aatt')
('chr8', 'tagg')
('chr11', 'aaaa')
('chr7', 'gtag')
listB:
('chr8', 'tagt')
('chr1', 'tttt')
('chr7', 'gtag')
('chr11','aaaa')
('chr9', 'atat')
The important point in this case is that the two columns must be treated as one unit.
Thanks for your time!
Convert to sets and intersect: set(a) & set(b)
Use Python sets
listA = (
    ('chr1', 'aatt'),
    ('chr8', 'tagg'),
    ('chr11', 'aaaa'),
    ('chr7', 'gtag'),
)
listB = (
    ('chr8', 'tagt'),
    ('chr1', 'tttt'),
    ('chr7', 'gtag'),
    ('chr11', 'aaaa'),
    ('chr9', 'atat'),
)
# Each tuple is hashed as a whole, so both columns are compared together.
combined = set(listA).intersection(set(listB))
for c, d in combined:
    print(c, d)
You can also use the & operator,
like this:
combined = set(listA) & set(listB)
Use set intersection.
setC = set(listA) & set(listB)
listC = list(setC) # if you really need a list
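If the data is still in the one-string-per-line form from the question, a rough sketch along these lines might help. It assumes each line holds exactly one tab between the two columns; the names list_a, list_b and to_pairs are made up for illustration. Since a set does not promise any particular order, the result here is built by walking the first list instead:

```python
# Sample data from the question, one tab-separated string per line.
list_a = ["chr1\taatt", "chr8\ttagg", "chr11\taaaa", "chr7\tgtag"]
list_b = ["chr8\ttagt", "chr1\ttttt", "chr7\tgtag", "chr11\taaaa", "chr9\tatat"]


def to_pairs(lines):
    """Split each 'name<TAB>sequence' line into a (name, sequence) tuple."""
    return [tuple(line.split("\t")) for line in lines]


pairs_a = to_pairs(list_a)
pairs_b = set(to_pairs(list_b))  # a set gives fast membership tests

# Keep list_a's order; a plain set intersection would not guarantee any order.
common = [pair for pair in pairs_a if pair in pairs_b]
print(common)  # [('chr11', 'aaaa'), ('chr7', 'gtag')]
```

Splitting is not strictly required, because the whole line can be compared as a single string, but tuples make it easier to print the columns separately afterwards.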
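import numpy as np
# intersect1d flattens its inputs and returns the sorted, unique common values,
# so use it on the one-string-per-line form rather than on the tuples.
np.intersect1d(list1, list2)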
There may be a performance gain in not creating two sets from the lists, which requires hashing every item in both, and instead creating only one set and iterating through the second list. If you know which list is large and which is small, build the set from the small one.
def intersect(smallList, largeList):
    values = set(smallList)
    intersection = []
    for v in largeList:
        if v in values:
            intersection.append(v)
    return intersection
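For what it's worth, here is how that single-set approach might look on the tuples from the question. This is only a sketch: the function is rewritten as a list comprehension, and list_a and list_b are illustrative names. Note that the result follows the order of the second (iterated) list and would repeat any item that appears more than once in it:

```python
def intersect(small_list, large_list):
    """Return the items of large_list that also occur in small_list."""
    values = set(small_list)  # only the smaller list is turned into a set
    return [v for v in large_list if v in values]


list_a = [('chr1', 'aatt'), ('chr8', 'tagg'), ('chr11', 'aaaa'), ('chr7', 'gtag')]
list_b = [('chr8', 'tagt'), ('chr1', 'tttt'), ('chr7', 'gtag'),
          ('chr11', 'aaaa'), ('chr9', 'atat')]

result = intersect(list_a, list_b)
for name, seq in result:
    print(name, seq)
# chr7 gtag
# chr11 aaaa
```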