set timeout to http response read method in python

Developer https://www.devze.com 2022-12-25 07:54 Source: Internet

I'm building a download manager in Python for fun. Sometimes the connection to the server stays open but the server stops sending data, so the read method (of HTTPResponse) blocks forever. This happens, for example, when I download from a server located outside my country that limits bandwidth to other countries.

How can I set a timeout for the read method (2 minutes for example)?

Thanks, Nir.

If you're stuck on some Python version < 2.6, one (imperfect but usable) approach is to do

import socket
socket.setdefaulttimeout(10.0)  # or whatever

before you start using httplib. The docs are here, and clearly state that setdefaulttimeout is available since Python 2.3. Every socket created from the time you make this call until you call the same function again will use that 10-second timeout. If you want to restore the previous timeout later (with another setdefaulttimeout call), save it first with getdefaulttimeout; note that the saved value may be None, meaning no timeout.

These functions and idioms are quite useful whenever you need to use some older higher-level library which uses Python sockets but doesn't give you a good way to set timeouts (of course it's better to use updated higher-level libraries, e.g. the httplib version that comes with 2.6 or the third-party httplib2 in this case, but that's not always feasible, and playing with the default timeout setting can be a good workaround).
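The save-and-restore idiom described above can be sketched as a small context manager. This is only an illustration written for Python 3; the name default_socket_timeout is hypothetical and not part of the socket module.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the process-wide default socket timeout.

    Hypothetical helper: it only wraps socket.getdefaulttimeout()
    and socket.setdefaulttimeout().
    """
    previous = socket.getdefaulttimeout()  # may be None (no timeout)
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore whatever was there before, including None.
        socket.setdefaulttimeout(previous)


with default_socket_timeout(10.0):
    # Sockets created inside this block inherit the 10-second timeout.
    s = socket.socket()
    inherited = s.gettimeout()
    s.close()
```

Because the default is process-wide, code running in other threads during the with block would also pick up the temporary value, which is one reason the approach is called imperfect above.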

You have to set it during HTTPConnection initialization.
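As a rough sketch of what that looks like, here is a Python 3 version (httplib is named http.client there). The local stalled_server is a hypothetical stand-in for a server that keeps the connection open but stops sending data; the 0.5-second timeout is only chosen to keep the demo short, and the question's 2 minutes would be timeout=120.

```python
import http.client  # this module is called httplib on Python 2
import socket
import threading


def stalled_server(listener, release):
    """Hypothetical test server: sends headers, then goes silent."""
    conn, _ = listener.accept()
    conn.recv(65536)  # consume the request
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 1000\r\n\r\n")
    release.wait(5)  # keep the connection open without sending the body
    conn.close()


listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
release = threading.Event()
threading.Thread(
    target=stalled_server, args=(listener, release), daemon=True
).start()

host, port = listener.getsockname()
# The timeout is passed when the connection object is created.
conn = http.client.HTTPConnection(host, port, timeout=0.5)
conn.request("GET", "/")
response = conn.getresponse()
try:
    response.read(100)  # the body never arrives, so this read times out
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    release.set()
    conn.close()
    listener.close()
```

After a timeout the connection is in an unknown state, so a download manager would probably want to close it and reconnect rather than retry the read on the same object.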

Note: if you are using an older version of Python, you can install httplib2. Many consider it a superior alternative to httplib, and it does support timeouts.
I've never used it, though; I'm just reporting what the documentation and blogs say.

Setting the default timeout might abort a download early if it's large, as opposed to only aborting if it stops receiving data for the timeout value. httplib2 is probably the way to go.

5 years later but hopefully this will help someone else...

I was racking my brain trying to figure this out. My problem was a server returning corrupt content and thus sending back less data than it claimed it would.

I came up with a nasty solution that seems to be working properly. Here it goes:

# NOTE: directly disabling blocking is not necessary, but it represents
# an important piece of the problem, so I am leaving it here.
# http_response.fp._sock.socket.setblocking(0)
http_response.fp._sock.settimeout(read_timeout)
http_response.read(chunk_size)

NOTE: This solution also works for any library that is built on the normal Python sockets (which should be all of them?), including the Python requests library. You just have to go a few levels deeper:

# Optional, as noted above; setblocking needs an argument (0 = non-blocking).
# resp.raw._fp.fp._sock.socket.setblocking(0)
resp.raw._fp.fp._sock.settimeout(read_timeout)
resp.raw.read(chunk_size)

As of this writing, I have not tried the following but in theory it should work:

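import requests

resp = requests.get(some_url, stream=True)
# Optional, as noted above; setblocking needs an argument (0 = non-blocking).
# resp.raw._fp.fp._sock.socket.setblocking(0)
resp.raw._fp.fp._sock.settimeout(read_timeout)
for chunk in resp.iter_content(chunk_size):
    pass  # do stuff with chunk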

Explanation

I stumbled upon this approach while reading this SO question about setting a timeout on socket.recv.

At the end of the day, any HTTP request has a socket. For httplib, that socket is located at resp.raw._fp.fp._sock.socket. The resp.raw._fp.fp._sock object is a socket._fileobj (which I honestly didn't look into very far), and I imagine its settimeout method internally sets the timeout on the socket attribute.
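To illustrate the underlying mechanism at the plain socket level, here is a minimal Python 3 sketch using only the public socket API. The pair of connected sockets is just a stand-in for a connection whose peer has gone silent; no HTTP library or private attribute is involved.

```python
import socket

# Two connected sockets; the peer never sends anything.
sock, silent_peer = socket.socketpair()
sock.settimeout(0.2)  # applies to each blocking operation on this socket

try:
    sock.recv(1024)  # nothing arrives, so this raises after about 0.2 s
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    sock.close()
    silent_peer.close()
```

Every read performed by a higher-level HTTP library eventually ends up in a recv call like this one, which is why setting the timeout on the underlying socket affects the library's read method.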