I am trying to create a function that extracts the meta keywords from a given URL and returns them. However, no matter what URL I pass to it, it always fails.
    def GetKeywords(url):
        soup = BeautifulSoup(url)
        keywords = soup.findAll('meta', attrs={'name':re.compile("^keywords$", re.I)}) #Find all meta keywords on that page
        if len(keywords) == 0: #Check to see if that page has any meta keywords to begin with
            print "No meta keywords for: " + str(url)
            return -1
        else: #If so then return them
            return keywords
Where does the BeautifulSoup documentation state that it accepts a URL and fetches it for you?
    soup = BeautifulSoup(url)
Sorry, but read the BeautifulSoup documentation yourself first instead of trying and guessing at API methods.
http://www.crummy.com/software/BeautifulSoup/documentation.html#Parsing a Document
What you likely want is to fetch the data yourself with Python's urllib2 module before feeding it into BeautifulSoup, or to look at something like the scrapy module.
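As a rough sketch of the fetch-then-parse approach: the snippet below assumes Python 3, where urllib2's urlopen lives in urllib.request, and the bs4 package (BeautifulSoup 4) rather than the older BeautifulSoup 3 import used in the question. The function names extract_keywords and get_keywords are made up for illustration.

```python
import re
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen

from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 is installed


def extract_keywords(html):
    """Return the content of every <meta name="keywords"> tag in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    # Match the name attribute case-insensitively, as the question's regex does.
    tags = soup.find_all("meta", attrs={"name": re.compile(r"^keywords$", re.I)})
    return [tag["content"] for tag in tags if tag.has_attr("content")]


def get_keywords(url):
    """Download the page first, then give the markup (not the URL) to the parser."""
    with urlopen(url) as response:
        html = response.read()
    return extract_keywords(html)
```

Calling get_keywords with a page URL would then return a list of keyword strings. Unlike the original function, this sketch returns an empty list instead of -1 when no meta keywords are found, so the caller can treat both cases the same way.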