I am trying to write a threaded Python script that iterates through a list of URLs and opens each one in a separate thread.
from BeautifulSoup import BeautifulSoup
from threading import Thread
import mechanize

tickers = ["aapl", "siri", "goog", "intc"]
nextTicker = 0

def quotes(i):
    br = mechanize.Browser()
    br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100914 Firefox/3.6.10')]
    r = br.open('http://finance.yahoo.com/q?s=' + tickers[nextTicker])
    html = r.read()
    soup = BeautifulSoup(html)
    price = soup.findAll('span', attrs={"id": "yfs_l10_" + tickers[nextTicker]})
    price = price[0].string
    print price

for i in range(4):
    t = Thread(target=quotes, args=(i,))
    t.start()
I know that I need a nextTicker = nextTicker + 1 in there so that each thread will grab a unique ticker symbol from the list named tickers, but I am not sure where to put this or how to ensure that each thread is getting a unique URL.
Right now when the script runs it just grabs the index 0 item from the list for all four threads. How do I get each thread to grab the next item in the list and append it to my base url?
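If you want thread-specific data, pass it in through the thread's arguments.
So use tickers[i] instead of tickers[nextTicker].
Better yet, pass the ticker itself and have quotes use its argument in place of tickers[nextTicker]:
for ticker in tickers:
    t = Thread(target=quotes, args=(ticker,))
    t.start()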
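As a rough illustration of passing the ticker in as an argument, here is a minimal sketch written for Python 3. To keep it runnable offline, the network fetch and parsing are replaced by a hypothetical build_url helper, and a results dict (also an assumption, not part of the original script) records what each thread was given.

```python
from threading import Thread

tickers = ["aapl", "siri", "goog", "intc"]
results = {}  # hypothetical: one entry per ticker, filled in by the threads


def build_url(ticker):
    # Same base URL as in the question, with the ticker appended.
    return "http://finance.yahoo.com/q?s=" + ticker


def quotes(ticker):
    # Each thread works only with the ticker it was handed, so no shared
    # index is involved. A real version would fetch and parse the page here.
    results[ticker] = build_url(ticker)


threads = [Thread(target=quotes, args=(ticker,)) for ticker in tickers]
for t in threads:
    t.start()  # a thread does nothing until start() is called
for t in threads:
    t.join()   # wait for every thread to finish

for ticker in tickers:
    print(results[ticker])
```

Because every thread receives its own ticker through args, nothing shared has to be incremented or locked.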
Possibly better yet, check out eventlet. It lets you write code like this while avoiding some of the problems with threads.
Instead of meddling with a nextTicker variable and having to lock it and so forth, just refer to tickers[i]. (Or even better, just pass the ticker itself!)
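For contrast, this is roughly what the shared-counter route would involve. It is only a sketch for Python 3: the names next_ticker_lock, claim_ticker and claimed are hypothetical, and the fetch itself is left out so the example runs offline.

```python
from threading import Lock, Thread

tickers = ["aapl", "siri", "goog", "intc"]
next_ticker = 0
next_ticker_lock = Lock()  # hypothetical: guards next_ticker and claimed
claimed = []               # hypothetical: records which ticker each thread took


def claim_ticker():
    # Read and advance the shared index as a single step under the lock,
    # so that no two threads can pick up the same position.
    global next_ticker
    with next_ticker_lock:
        ticker = tickers[next_ticker]
        next_ticker += 1
        claimed.append(ticker)
    return ticker


def quotes():
    ticker = claim_ticker()
    # A real version would fetch and parse the page for this ticker here.


threads = [Thread(target=quotes) for _ in tickers]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(claimed))
```

This works, but it takes a global, a lock and a helper to do what passing the ticker through args does in one line.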