Reading HTTP server push streams with Python

开发者 (https://www.devze.com), 2022-12-28 01:06. Source: the web

I'm trying to write a client for a site that provides data as an HTTP stream (also known as HTTP server push). However, urllib2.urlopen() grabs the stream in its current state and then closes the connection. I tried skipping urllib2 and using httplib directly, but it behaves the same way.

The request is a POST with a set of five parameters. No cookies or authentication are required.

Is there a way to get the stream to stay open, so it can be checked each program loop for new contents, rather than waiting for the whole thing to be redownloaded every few seconds, introducing lag?

You could try the requests library.

import requests

# stream=True returns as soon as the headers arrive and leaves the body unread
r = requests.get('http://httpbin.org/stream/20', stream=True)

for line in r.iter_lines():
    # filter out keep-alive new lines
    if line:
        print(line)

You can also pass query parameters:

import requests

settings = {'interval': '1000', 'count': '50'}
url = 'http://agent.mtconnect.org/sample'

# params= is encoded into the URL's query string
r = requests.get(url, params=settings, stream=True)

for line in r.iter_lines():
    # filter out keep-alive new lines
    if line:
        print(line)
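
Since the question describes a POST rather than a GET, the same pattern can probably be adapted with requests.post(). What follows is only a minimal sketch, assuming Python 3 and an installed requests package. The helper names post_stream and nonempty_lines are made up for illustration, and the timeout values are arbitrary choices.

```python
def nonempty_lines(lines):
    """Yield decoded text lines, skipping the empty keep-alive lines."""
    for raw in lines:
        if raw:
            yield raw.decode("utf-8", errors="replace")


def post_stream(url, form):
    """POST `form` (a dict of fields) and yield body lines as they arrive."""
    # Third-party package; imported here so nonempty_lines works without it.
    import requests

    # timeout=(connect, read): give up on a slow connect after 5 seconds,
    # but wait indefinitely between pieces of streamed data.
    with requests.post(url, data=form, stream=True, timeout=(5, None)) as r:
        r.raise_for_status()
        for line in nonempty_lines(r.iter_lines()):
            yield line
```

A caller would loop over it, for example `for line in post_stream('http://example.com/stream', {'param1': 'value1'}): print(line)`, where the URL and field name are placeholders rather than the real site's parameters. The loop only ends when the server closes the connection.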

Do you need to actually parse the response headers, or are you mainly interested in the content? And is your HTTP request complex, making you set cookies and other headers, or will a very simple request suffice?

If you only care about the body of the HTTP response and don't have a very fancy request, you should consider simply using a socket connection:

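import socket

SERVER_ADDR = ("example.com", 80)

sock = socket.create_connection(SERVER_ADDR)

# HTTP/1.0 request, so the server must not reply with chunked encoding
sock.sendall(b"GET / HTTP/1.0\r\n"
             b"Host: example.com\r\n"    # you can put other headers here too
             b"\r\n")

f = sock.makefile("rb")

# skip headers (they end at the first blank line, or at EOF on failure)
while f.readline() not in (b"\r\n", b""):
    pass

# keep reading forever
while True:
    line = f.readline()     # blocks until more data is available
    if not line:
        break               # we ran out of data!

    # readline() keeps the line ending, so strip it before printing
    print(line.decode("utf-8", "replace").rstrip("\r\n"))

f.close()
sock.close()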
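
The loop above blocks inside readline(), which does not fit the question's wish to check for new content once per program loop. One possible way around that is to ask select() whether the socket is readable before calling recv(). The following is a rough sketch under a few assumptions: Python 3, a request that has already been sent on the socket, and a plain (non-chunked) response body. LinePoller is a hypothetical name, not part of any library.

```python
import select
import socket


class LinePoller:
    """Collect complete lines from a socket without blocking the caller."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""
        self.in_headers = True   # response headers have not been skipped yet
        self.closed = False      # set once the server closes the connection

    def poll(self):
        """Return the body lines that arrived since the last call (maybe none)."""
        # A timeout of 0 makes select() return immediately instead of waiting.
        while not self.closed and select.select([self.sock], [], [], 0)[0]:
            chunk = self.sock.recv(4096)
            if not chunk:
                self.closed = True
                break
            self.buffer += chunk

        if self.in_headers:
            # Headers end at the first blank line; discard them once complete.
            _head, sep, rest = self.buffer.partition(b"\r\n\r\n")
            if not sep:
                return []
            self.buffer, self.in_headers = rest, False

        # Everything before the last newline is complete; keep the remainder.
        *lines, self.buffer = self.buffer.split(b"\n")
        return [l.rstrip(b"\r").decode("utf-8", errors="replace") for l in lines]
```

In a program loop this would be used as `poller = LinePoller(sock)` once, followed by `for line in poller.poll(): ...` on every iteration, stopping when `poller.closed` becomes true. A partial line stays in the buffer until its newline arrives.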

One way to do it using urllib2 is (assuming this site also requires Basic Auth):

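import urllib2

url = 'http://streamingsite.com'

# register the credentials for any realm at this URL
p_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
p_mgr.add_password(None, url, 'login', 'password')

auth = urllib2.HTTPBasicAuthHandler(p_mgr)
opener = urllib2.build_opener(auth)

# optional: also make this opener the default for urllib2.urlopen()
urllib2.install_opener(opener)
f = opener.open(url)

while True:
    data = f.readline()     # blocks until the server sends another line
    if not data:
        break               # the server closed the stream
    print(data)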
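
On Python 3, urllib2 was folded into urllib.request, and the question's POST with form parameters could be approached along the following lines. This is a sketch only, assuming the parameters are sent URL-encoded in the request body and that no authentication is needed (as the question states). The function names are invented for this example.

```python
import urllib.parse
import urllib.request


def build_post_request(url, params):
    """Build a POST request carrying `params` as a URL-encoded form body."""
    body = urllib.parse.urlencode(params).encode("ascii")
    # Supplying data= is what turns the request into a POST.
    return urllib.request.Request(url, data=body)


def iter_stream_lines(response):
    """Yield non-empty text lines until the server closes the stream."""
    for raw in response:  # file-like responses iterate one line at a time
        line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        if line:
            yield line


def stream_post(url, params):
    """Open the stream and yield its lines, closing the connection at the end."""
    with urllib.request.urlopen(build_post_request(url, params)) as response:
        yield from iter_stream_lines(response)
```

Usage would look like `for line in stream_post('http://streamingsite.com', {'param1': 'value1'}): print(line)`, where the parameter name is a placeholder. Each iteration blocks until the next line arrives, so the connection stays open rather than being re-downloaded every few seconds.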