Efficient way of Element lookup in a Python List?

Developer https://www.devze.com 2023-01-23 13:02 Source: Internet

I have a list of files in a directory. I have to process only certain files from that directory. filelist is my desired file-list. How do I go about achieving this? Not interested in a bash solution since I have to do it all in this one Python script. Thanks much!

for record in result:
    filelist.append(record[0])

print filelist


for file in os.listdir(sys.argv[1].strip() + "/"):
    for file in filelist: #This doesn't work, how else do I do this? If file equals to my desired file-list, then do something.
        print file

Sorry guys, not sure how I missed this! Early morning coding I guess!! Mods, please close it unless someone wants to chip in with an efficient way of doing it.

for file in os.listdir(sys.argv[1].strip() + "/"):
    if file in filelist:
        print file


If order and uniqueness don't matter, you can use a set intersection, which will be much more efficient.

import os
import sys

os_list = os.listdir(sys.argv[1].strip() + "/")
for file in set(os_list) & set(filelist):
    pass  # ... (set is a builtin, so no "import set" is needed)
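
A small worked example of what the intersection gives you. The file names below are made up for illustration and are not from the question. The result is a set, so duplicates collapse and there is no defined order, which is why it is sorted before printing:

```python
# Hypothetical directory listing and wanted-list, for illustration only.
os_list = ["a.txt", "b.txt", "c.log", "d.csv"]
filelist = ["d.csv", "a.txt", "zzz.txt", "a.txt"]

# Set intersection keeps only the names present in both collections.
matches = set(os_list) & set(filelist)

# Sort only to get a stable order for display; sets are unordered.
for name in sorted(matches):
    print(name)
```

The single-argument print(...) form is used here so the snippet runs under both Python 2 and Python 3.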

Example of improvement:

import random
import timeit

l = [random.randint(1,10000) for i in range(1000)]
l2 = [random.randint(1,10000) for i in range(1000)]

def f1():
    l3 = []
    for i in l:
        if i in l2:
            l3.append(i)
    return l3

def f2():
    l3 = []
    for i in set(l) & set(l2):
        l3.append(i)
    return l3

t1 = timeit.Timer('f1()', 'from __main__ import f1')
print t1.timeit(100) #2.0850549985

t2 = timeit.Timer('f2()', 'from __main__ import f2')
print t2.timeit(100) #0.0162533142857


Sounds like you want to do a test:

for file in os.listdir(sys.argv[1].strip() + "/"):
    if file in filelist:
        # Found a file in the wanted-list.
        print file


It looks like you just want to do something like this:

for file in os.listdir(sys.argv[1].strip() + "/"): 
    if file in filelist:
        print file 

Note that I just changed the second for to an if. However, since you were asking about efficiency, you probably want to change filelist from being a list to being a set or a dict to make the in operator more efficient.
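
A minimal sketch of that suggestion, assuming filelist has already been built as in the question. The helper name wanted_files and the variable wanted are made up for illustration. Converting the list to a set once should make each "in" test a hash lookup on average instead of a scan of the whole list, and looping over os.listdir keeps whatever order the directory listing came back in:

```python
import os

def wanted_files(directory, filelist):
    """Return the entries of `directory` whose names appear in `filelist`."""
    wanted = set(filelist)  # built once; membership tests are O(1) on average
    return [name for name in os.listdir(directory) if name in wanted]
```

In the question's script this could be called as wanted_files(sys.argv[1].strip(), filelist), with the processing done on each returned name.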


Something like this:

print [x for x in os.listdir(sys.argv[1].strip() + "/") if x in filelist]