I need to download a CSV file, which works fine in browsers using:
http://www.ftse.com/objects/csv_to_csv.jsp?infoCode=100a&theseFilters=&csvAll=&theseColumns=Mw==&theseTitles=&tableTitle=FTSE%20100%20Index%20Constituents&dl=&p_encoded=1&e=.csv
The following code works for any other file URL (with a fully qualified path); however, with the above URL it downloads 800 bytes of gibberish.
import urllib2

def getFile(self, URL):
    # Route HTTP requests through the proxy.
    proxy_support = urllib2.ProxyHandler({'http': 'http://proxy.REMOVED.com:8080/'})
    opener = urllib2.build_opener(proxy_support)
    urllib2.install_opener(opener)
    response = urllib2.urlopen(URL)
    print response.geturl()
    newfile = response.read()
    # Write the raw response bytes to disk.
    output = open("testFile.csv", 'wb')
    output.write(newfile)
    output.close()
urllib2 uses httplib under the hood, so the best way to diagnose this is to turn on HTTP connection debugging. Add this code before you access the URL and you should get a summary of exactly what HTTP traffic is being generated:
import httplib
httplib.HTTPConnection.debuglevel = 1
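If you would rather not change the class-wide setting, a similar effect can probably be had by giving the opener an HTTPHandler with debuglevel=1. This is only a sketch: the helper name build_debug_opener and the proxy.example.com address are made up for illustration, and the import fallback is there so the same snippet also runs on Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 home of the same classes


def build_debug_opener(proxy_url=None):
    """Return an opener that prints HTTP traffic (hypothetical helper)."""
    # debuglevel=1 makes the underlying connection print the request
    # line, the headers it sends, and the response headers it receives.
    handlers = [urllib2.HTTPHandler(debuglevel=1)]
    if proxy_url:
        # Assumption: plain HTTP traffic goes through a single proxy.
        handlers.append(urllib2.ProxyHandler({'http': proxy_url}))
    return urllib2.build_opener(*handlers)


# Placeholder proxy address; substitute your own.
opener = build_debug_opener('http://proxy.example.com:8080/')
# response = opener.open(URL)  # URL as in the question
```

Because the debug handler is attached to this one opener, other code that calls urllib2.urlopen directly stays quiet unless you also call install_opener. Comparing the printed headers for the failing URL with those for a URL that works should show what differs.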