开发者

what's the best way to disguise a urllib2 query as a human request (beyond user-agent)?

开发者 https://www.devze.com 2023-03-11 18:34 出处:网络
What\'s the best way to disguise a Python program using urllib2?I know how to set the user-agent which is a good start.But what about other items like the referring URL?Any way to set that?Any other s

What's the best way to disguise a Python program using urllib2? I know how to set the user-agent which is a good start. But what about other items like the referring URL? Any way to set that? Any other suggestions?

Here's what I'm using to add the user-agent:

opener = urllib2.build_opener()
opener.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20110开发者_JAVA百科506 Firefox/4.0.1')]
f = opener.open("http://www.domain.com")


There are a lot of specifics that you haven't mentioned. To find your answers, simply go to www.domain.com in your favorite browser (with good dev tools), and examine the network traffic.

Chrome has built in tools. Firebug is for firefox.

Look at all of the headers that were sent, and replicate as per your specific need.


You could point a real browser to a tool like this one or this one, see the exact contents of all fields that your real browser sends, and mimic that in your Python script.

0

精彩评论

暂无评论...
验证码 换一张
取 消