I'm using the program from: here
to download many URLs at once. It works fine, but the order of the results in the queue I receive is not the same as the order of the URLs in the urls list, and it is also not constant (it changes from run to run).
What can I do to either make the order constant or to know which URL each item in the received queue belongs to?
Thanks.
Change fetch to read like this:
def fetch(url):
    return (url, urllib2.urlopen(url).read())
Then, instead of a queue full of strings, each containing a result, you get a queue full of tuples, each containing the URL followed by its result.
You won't be able to get back a queue whose items are always in the same order, because the order in which threads finish is not deterministic. So the best thing to do is tag each result so you can identify it later.
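Once each result is tagged with its URL, putting the results back into the original order is a matter of a dictionary lookup. Here is a minimal sketch of that idea, written for Python 3 rather than the Python 2 urllib2 code above. The names fake_download and fetch_all are hypothetical, and fake_download is only a stand-in for urllib2.urlopen(url).read() so the example runs without network access; the threading setup is an assumption and may differ from the linked program.

```python
import queue
import threading


def fake_download(url):
    # Hypothetical stand-in for urllib2.urlopen(url).read(); no network needed.
    return "contents of " + url


def fetch(url):
    # Tag each result with the URL it came from.
    return (url, fake_download(url))


def fetch_all(urls):
    results = queue.Queue()
    threads = [
        threading.Thread(target=lambda u=u: results.put(fetch(u)))
        for u in urls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The queue order is arbitrary, so index the results by URL...
    by_url = dict(results.get() for _ in urls)
    # ...and read them back in the order of the original list.
    return [by_url[u] for u in urls]


urls = [
    'http://www.google.com/',
    'http://www.lycos.com/',
    'http://www.bing.com/',
]
ordered = fetch_all(urls)
```

After this runs, ordered[i] holds the result for urls[i], no matter which thread finished first.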
You can just pair each URL with its index number...
urls = [
    (0, 'http://www.google.com/'),
    (1, 'http://www.lycos.com/'),
    (2, 'http://www.bing.com/'),
    (3, 'http://www.altavista.com/'),
    (4, 'http://achewood.com/'),
]

def fetch(index, url):
    data = urllib2.urlopen(url).read()
    # ... do whatever you need using index ...
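One thing you could do with the index is write each result into a pre-sized list, so the list ends up in the same order as urls. The following is only a sketch under some assumptions: it targets Python 3, fake_download is a hypothetical stand-in for urllib2.urlopen(url).read() so it runs offline, and fetch takes an extra results argument that the version above does not have.

```python
import threading


def fake_download(url):
    # Hypothetical stand-in for urllib2.urlopen(url).read(); no network needed.
    return "contents of " + url


def fetch(index, url, results):
    # Each thread writes only to its own slot, so the slots never collide.
    results[index] = fake_download(url)


urls = [
    (0, 'http://www.google.com/'),
    (1, 'http://www.lycos.com/'),
    (2, 'http://www.bing.com/'),
]

# One slot per URL, filled in by whichever thread handles that index.
results = [None] * len(urls)
threads = [
    threading.Thread(target=fetch, args=(index, url, results))
    for index, url in urls
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If you would rather not number the list by hand, enumerate() over a plain list of URLs yields the same (index, url) pairs.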