
Resuming an s3 bucket listing via boto

Developer https://www.devze.com 2023-03-29 03:36 Source: Internet

I'm iterating over 2 million objects like this:

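import boto

# Connect with an access key and secret key, then look up the bucket.
conn = boto.connect_s3('xxx', 'xxx')
bucket = conn.lookup('bucket_name')

# bucket.list() pages through the bucket lazily, one key at a time.
for key in bucket.list():
    somefunction(key.name)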

Say it fails at the millionth object. How would I go about resuming this operation from that point?


I figured it out by looking at the boto source.

def list(self, prefix='', delimiter='', marker='', headers=None):

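Passing the name of the last key you processed (key.name) as marker lets you resume the operation from that point. S3 returns only keys that sort after the marker, so the marker key itself is not listed again.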
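
A minimal sketch of how that could look in practice, assuming you record the last processed key name in a local file. The function name resume_listing, the handle callback, and the checkpoint file name are hypothetical; the only boto call used is bucket.list(marker=...) from the signature above.

```python
import os


def resume_listing(bucket, handle, checkpoint="last_key.txt"):
    """Process every key in `bucket`, resuming after the checkpointed key.

    `bucket` is expected to be a boto Bucket (anything with a compatible
    list(marker=...) method works). `handle` and `checkpoint` are
    hypothetical names chosen for this sketch.
    """
    # Load the name of the last key that was fully processed, if any.
    marker = ""
    if os.path.exists(checkpoint):
        with open(checkpoint) as f:
            marker = f.read()

    last_done = marker
    try:
        # Listing starts with the first key that sorts after `marker`.
        for key in bucket.list(marker=marker):
            handle(key.name)
            last_done = key.name
    finally:
        # Record progress whether the loop finished or raised.
        with open(checkpoint, "w") as f:
            f.write(last_done)
    return last_done
```

If handle raises on some key, the checkpoint holds the key before it, so calling resume_listing again retries the failed key and continues from there.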


Here is an example of resuming requests page by page using the marker parameter.

This is also useful if you want to recurse through subtrees or have many millions of objects to crawl through and don't want them in a single list.

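marker = ''  # empty marker: start from the beginning of the bucket
while True:
    # Fetch one page of results, starting after `marker`.
    keys = bucket.get_all_keys(marker=marker)
    last_key = None

    for k in keys:
        # TODO Do something with your keys!
        last_key = k.name

    # Stop once S3 reports no further pages (or the page was empty).
    if not keys.is_truncated or last_key is None:
        break

    # The next page starts after the last key seen on this one.
    marker = last_key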
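
The same loop could also be wrapped in a generator so that only one page of key names is in memory at a time. This is a sketch under assumptions: iter_pages is a hypothetical helper name, and it relies only on get_all_keys(marker=...) and the is_truncated attribute used in the loop above.

```python
def iter_pages(bucket, marker=""):
    """Yield lists of key names, one list per page of results.

    `iter_pages` is a hypothetical helper; `bucket` is expected to be a
    boto Bucket. Pass the last key name you processed as `marker` to
    resume from there.
    """
    while True:
        page = bucket.get_all_keys(marker=marker)
        names = [k.name for k in page]
        if names:
            yield names
        # Stop when there are no more pages (or nothing came back).
        if not page.is_truncated or not names:
            break
        # The next request starts after the last key of this page.
        marker = names[-1]
```

The last name in each yielded list is the marker to store if you want to be able to resume after that page.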
