I'm trying to download multiple images concurrently over the internet using Python, and I've looked at several options, but none of them seem satisfactory.
I've considered pyCurl, but I don't really understand the example code, and it seems like overkill for a task as simple as this. urlgrabber seems to be a good choice, but the documentation says the batch download feature is still in development, and I can't find anything about batch downloads in the urllib2 documentation.
Is there an option that actually works and is simpler to implement? Thanks.
It's not fancy, but you can use urllib.urlretrieve, and a pool of threads or processes running it.
Because they're waiting on network IO, you can get multiple threads running concurrently - stick the URLs and destination filenames in a Queue.Queue, and have each thread suck them up.
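Here's a minimal sketch of that approach, assuming Python 2 (urllib.urlretrieve and Queue.Queue are the Python 2 names); the URLs, filenames, and thread count are placeholders, not anything from your setup:

    import urllib
    import threading
    import Queue

    def worker(q):
        # Each thread pulls (url, filename) pairs off the queue until it's empty.
        while True:
            try:
                url, filename = q.get_nowait()
            except Queue.Empty:
                return
            urllib.urlretrieve(url, filename)

    # Placeholder jobs - substitute your own URLs and destination paths.
    jobs = [('http://example.com/img1.jpg', 'img1.jpg'),
            ('http://example.com/img2.jpg', 'img2.jpg')]

    q = Queue.Queue()
    for job in jobs:
        q.put(job)

    # Four threads is arbitrary; tune it to your bandwidth and the server's patience.
    threads = [threading.Thread(target=worker, args=(q,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Because the threads spend nearly all their time blocked on the network, the GIL doesn't really get in the way here.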
If you use multiprocessing, it's even easier - just create a Pool of processes, and call mypool.map with the function and an iterable of arguments. There isn't a thread pool in the standard library, but you can get a third party module if you need to avoid launching separate processes.