开发者

Using mimetools.Message in urllib2.urlopen

开发者 https://www.devze.com 2023-03-06 01:23 出处:网络
I\'m using urllib2.urlopen(): req = urllib2.Request(\'http://www.google.com\') resp = urllib2.urlopen(req)

I'm using urllib2.urlopen():

req = urllib2.Request('http://www.google.com')
resp = urllib2.urlopen(req)
print resp.info()

print resp.info(开发者_运维知识库)['set-cookie']

Date: Sat, 14 May 2011 01:24:12 GMT

Expires: -1

Cache-Control: private, max-age=0

Content-Type: text/html; charset=ISO-8859-1

Set-Cookie: PREF=ID=5ec78624283cc050:FF=0:TM=1305336252:LM=1305336252:S=eRXgUUuzhQbRmZxk; expires=Mon, 13-May-2013 01:24:12 GMT; path=/; domain=.google.com

Set-Cookie: NID=46=GxyZVeWbT9dn0sLa9waPGSusm1hFqGf46SPqewahg0bzbYIQX0oHff0bzJ33E2yO89npEsYkqSoX0HLSqHbCxj5tCK2E931PfEJbqDMB6lTDk4ngVAiiyObWmbHgRUC9; expires=Sun, 13-Nov-2011 01:24:12 GMT; path=/; domain=.google.com; HttpOnly

Server: gws

X-XSS-Protection: 1; mod


PREF=ID=5ec78624283cc050:FF=0:TM=1305336252:LM=1305336252:S=eRXgUUuzhQbRmZxk; expires=Mon, 13-May-2013 01:24:12 GMT; path=/; domain=.google.com, NID=46=GxyZVeWbT9dn0sLa9waPGSusm1hFqGf46SPqewahg0bzbYIQX0oHff0bzJ33E2yO89npEsYkqSoX0HLSqHbCxj5tCK2E931PfEJbqDMB6lTDk4ngVAiiyObWmbHgRUC9; expires=Sun, 13-Nov-2011 01:24:12 GMT; path=/; domain=.google.com; HttpOnly

As you can see in the headers received in the response, there are TWO statements of 'set-cookie', HOWEVER in the resp.info() object I receive it has grouped both cookie statements together and separates them by a ',' (comma)

This is troublesome to separate the cookies by this delimiter since there are commas inside the cookie information i'm try to separate with this comma delimiter

Is there an easy way to call upon each cookie string individually with this mimetools.message object? (resp.info())

else-> I'll just have to parse the headers manually without this not so helpful mimetools.message/dictionary object


Try using getheaders() to get a list of the cookies:

>>> msg = resp.info()
>>> msg.getheaders('Set-Cookie')
['PREF=ID=5975a5ee255f0949:FF=0:TM=1305336283:LM=1305336283:S=1vkES6eF4Yxd-_oM; expires=Mon, 13-May-2013 01:24:43 GMT; path=/; domain=.google.com.au', 'NID=46=lQVFZg6yKUsoWT529Hqp5gA8B_CKYd2epPIbANmw_J0UzeMt2BhuMF-gtmGsRhenUTeajKz2zILXd9xWpHWT8ZGvDcmNdkzaGX-L_-sKyY1w4e2l3DKd80JzSkt2Vp-H; expires=Sun, 13-Nov-2011 01:24:43 GMT; path=/; domain=.google.com.au; HttpOnly']

In this case, you get a list of two strings.

Then you can iterate over that list and grab whichever cookie you like. str.startswith() is your friend:

>>> cookies = msg.getheaders('Set-Cookie')
>>> for cookie in cookies:
...     if cookie.startswith('PREF='):
...             print 'Got PREF: ', cookie
...     else:
...             print 'Got another: ', cookie
... 
Got PREF:  PREF=ID=5975a5ee255f0949:FF=0:TM=1305336283:LM=1305336283:S=1vkES6eF4Yxd-_oM; expires=Mon, 13-May-2013 01:24:43 GMT; path=/; domain=.google.com.au
Got another:  NID=46=lQVFZg6yKUsoWT529Hqp5gA8B_CKYd2epPIbANmw_J0UzeMt2BhuMF-gtmGsRhenUTeajKz2zILXd9xWpHWT8ZGvDcmNdkzaGX-L_-sKyY1w4e2l3DKd80JzSkt2Vp-H; expires=Sun, 13-Nov-2011 01:24:43 GMT; path=/; domain=.google.com.au; HttpOnly

How a newbie can find the documentation in Python

% python
Python 2.7.1 (r271:86832, Jan 29 2011, 13:30:16) 
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> req = urllib2.Request('http://www.google.com')
>>> resp = urllib2.urlopen(req)
>>> help(resp.info())
0

精彩评论

暂无评论...
验证码 换一张
取 消