Using an HTTP PROXY - Python [duplicate]

Developer https://www.devze.com 2023-02-23 22:51 Source: web

This question already has answers here: Proxy with urllib2 (7 answers) Closed 7 years ago.

I am familiar with the fact that I should set the HTTP_PROXY environment variable to the proxy address.

Generally urllib works fine, the problem is dealing with urllib2.

>>> urllib2.urlopen("http://www.google.com").read()

returns

urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>

urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>

Extra info:

urllib.urlopen(....) works fine! It is just urllib2 that is playing tricks...

I tried @Fenikso's answer, but I'm getting this error now:

URLError: <urlopen error [Errno 10060] A connection attempt failed because the 
connected party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond>

Any ideas?

You can do it even without the HTTP_PROXY environment variable. Try this sample:

import urllib2

proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

html = urllib2.urlopen("http://www.google.com").read()
print html

In your case it really seems that the proxy server is refusing the connection.
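Before changing any urllib2 code, it may help to check whether the proxy accepts TCP connections at all. The following is a minimal Python 3 sketch, not part of the original answer; the helper name check_proxy is hypothetical. The three failure cases roughly correspond to the errors quoted in the question: connection refused (Errno 10061), timeout (Errno 10060), and failed name lookup (Errno 11004).

```python
import socket


def check_proxy(host, port, timeout=5.0):
    """Try to open a TCP connection to host:port.

    Returns a (reachable, reason) tuple. check_proxy is a hypothetical
    helper for diagnosis only; it does not speak HTTP to the proxy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "ok"
    except ConnectionRefusedError:
        # Nothing is listening on that port, or a firewall rejected us.
        return False, "refused"
    except socket.timeout:
        # The host did not answer within the timeout.
        return False, "timed out"
    except socket.gaierror:
        # The host name could not be resolved.
        return False, "name lookup failed"
    except OSError as exc:
        # Any other network-level failure.
        return False, str(exc)
```

For example, check_proxy("61.233.25.166", 80) would test the proxy address used in the sample above. If it does not return (True, "ok"), the problem lies with the proxy or the network rather than with urllib2.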

Something more to try:

import urllib2

#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"

proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}

proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)

req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html

Edit 2014: This seems to be a popular question / answer. However, today I would use the third-party requests module instead.

For one request just do:

import requests

r = requests.get("http://www.google.com", 
                 proxies={"http": "http://61.233.25.166:80"})
print(r.text)

For multiple requests, use a Session object so that you do not have to pass the proxies parameter with every request:

import requests

s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}

r = s.get("http://www.google.com")
print(r.text)

I recommend you just use the requests module.

It is much easier than the built-in HTTP clients: http://docs.python-requests.org/en/latest/index.html

Sample usage:

import requests

r = requests.get('http://www.thepage.com', proxies={"http":"http://myproxy:3129"})
thedata = r.content  # raw response body as bytes

Just wanted to mention that you may also have to set the https_proxy OS environment variable if you need to access https URLs. This was not obvious to me, and it took me hours to figure out.
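To see which proxies the standard library would pick up, you can ask it directly. This is a small Python 3 sketch (urllib.request is the Python 3 successor of urllib2); the proxy address is a made-up placeholder, and the variables are set from inside the script only to keep the example self-contained. Normally you would set them in the shell or in the OS settings.

```python
import os
import urllib.request

# Hypothetical proxy address; replace it with your own.
PROXY = "http://proxy.example.com:3128"

# One variable per URL scheme: http_proxy alone does not cover https URLs.
os.environ["http_proxy"] = PROXY
os.environ["https_proxy"] = PROXY

# getproxies() returns a dict mapping each scheme to its proxy URL.
proxies = urllib.request.getproxies()
print(proxies.get("http"))
print(proxies.get("https"))
```

If the "https" entry comes back as None, requests to https URLs will not go through the proxy.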

My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py

Python 3:

import urllib.request

# Note: FancyURLopener has been deprecated since Python 3.3.
url = "http://www.google.com"
htmlsource = urllib.request.FancyURLopener({"http":"http://127.0.0.1:8080"}).open(url).read().decode("utf-8")
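Since FancyURLopener has been deprecated since Python 3.3, a ProxyHandler-based opener may be the safer choice in Python 3. The sketch below mirrors the urllib2 answer above using urllib.request; the function name make_proxy_opener is hypothetical, and 127.0.0.1:8080 is just the placeholder address from the snippet above.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that sends plain-HTTP requests through proxy_url."""
    proxy_support = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_support)


# Placeholder address; use your own proxy here.
opener = make_proxy_opener("http://127.0.0.1:8080")

# Uncomment to perform a real request through the proxy:
# html = opener.open("http://www.google.com", timeout=10).read().decode("utf-8")
# print(html)
```

Calling opener.open(...) keeps the proxy local to that opener. If you want every urllib.request.urlopen call to use it, pass the opener to urllib.request.install_opener, as in the urllib2 example.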

I ran into this with a Jython client. The server only spoke TLS, while the client was creating an SSL context:

javax.net.ssl.SSLContext.getInstance("SSL")

Once the client was switched to TLS, things started working.
