
Access Google with Python

Developer https://www.devze.com 2023-01-19 22:02 Source: Internet
How can I access Google?

I tried this code:

urllib.urlopen('http://www.google.com')

but it shows a message asking me to prove I am human, or something like that.

Some people say to try setting a user agent, but I don't know how.


You should use the Google API to access the search. Here's an example for Python. Unutbu provided a link to an older SO answer that contains a corrected version of the same example code.

#!/usr/bin/python
import urllib, urllib2
import json

api_key, userip = None, None
query = {'q' : 'search google python api'}
referrer = "https://stackoverflow.com/q/3900610"

if userip:
    query.update(userip=userip)
if api_key:
    query.update(key=api_key)

url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % (
    urllib.urlencode(query))

request = urllib2.Request(url, headers=dict(Referer=referrer))
# Use a separate name so the json module is not shadowed by the parsed response
response = json.load(urllib2.urlopen(request))

results = response['responseData']['results']
for r in results:
    print r['title'] + ": " + r['url']
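
If you are on Python 3, urllib2 no longer exists, so here is a rough sketch of the same idea split into two small pieces: building the URL and pulling titles and URLs out of the JSON. The helper names build_search_url and extract_results and the SAMPLE_RESPONSE text are made up for illustration; only the endpoint and the responseData/results layout come from the code above, and I haven't checked whether that endpoint still answers.

```python
import json
import urllib.parse

# Endpoint taken from the Python 2 example above (not verified to still respond).
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(terms, api_key=None, userip=None):
    """Build the query URL the same way the Python 2 code does."""
    query = {"q": terms}
    if userip:
        query["userip"] = userip
    if api_key:
        query["key"] = api_key
    return BASE_URL + "?v=1.0&" + urllib.parse.urlencode(query)


def extract_results(payload_text):
    """Return (title, url) pairs from a response shaped like the one above."""
    data = json.loads(payload_text)
    return [(r["title"], r["url"]) for r in data["responseData"]["results"]]


# Made-up response text, only to show the shape the parsing code expects.
SAMPLE_RESPONSE = (
    '{"responseData": {"results": '
    '[{"title": "Example", "url": "http://example.com/"}]}}'
)
```

To actually send the request you would pass the URL to urllib.request.urlopen, with the Referer header set as in the original code, and feed the decoded body to extract_results.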


A user agent string is indeed the way to go: pick any valid user agent from any common browser. In Python 2.x, the following code should give you what you want:

import urllib2
r = urllib2.Request('http://www.google.com/')
r.add_header('User-Agent', 
             'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) '
             'Gecko/20081202 Firefox (Debian-2.0.0.19-0etch1)')
html = urllib2.urlopen(r).read()
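
On Python 3 the same thing lives in urllib.request. This is just a sketch: the names USER_AGENT, build_request and fetch are mine, and the user agent string is simply the one from the snippet above, so swap in whatever browser string you like.

```python
import urllib.request

# Same user agent string as in the Python 2 snippet; any common browser string should do.
USER_AGENT = (
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) "
    "Gecko/20081202 Firefox (Debian-2.0.0.19-0etch1)"
)


def build_request(url, user_agent=USER_AGENT):
    """Create a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download the page body as bytes using the request built above."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch('http://www.google.com/') would then give you the raw HTML as bytes, the same as html in the Python 2 version.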

Having said that, unutbu's recommendation to use the Google search API (if you're looking to do searches) is by far the better way to go, since it avoids all that messy HTML parsing.
