Question regarding Universal Feed Parser

Developer https://www.devze.com 2023-01-04 02:47 Source: web
I ran into a problem grabbing the content from a couple of blog feeds I have crawled.

I'm not sure why, but running the following feedparser code against one or two of the blogs raises an error:

    results = feedparser.parse(url)
    ent = []
    for entry in results.entries:
        e = {}
        e['title'] = entry.title
        e['content'] = entry.content[0].value

object has no attribute 'content'

or

object has no attribute 'link'

This doesn't happen with any of my other blogs. Could empty entry content be the cause?


There is a mapping between the XML tags used in the feed and the attributes available on the entries in feedparser. View the source of one of the feeds that has been causing the problem and see what tags it uses. You might find it doesn't include content for the entries or that the links are in a field like uid rather than link.
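Before changing the parsing code, it can help to find out which entries are actually missing which attributes. The following is only a rough sketch: `missing_fields` is a hypothetical helper, not part of feedparser, and the `SimpleNamespace` objects stand in for real entries, which would normally come from `feedparser.parse(url).entries`.

```python
from types import SimpleNamespace


def missing_fields(entries, required=('title', 'link', 'content')):
    """Return {entry index: [required attribute names the entry lacks]}."""
    report = {}
    for i, entry in enumerate(entries):
        missing = [name for name in required if not hasattr(entry, name)]
        if missing:
            report[i] = missing
    return report


# Stand-in entries for illustration only (assumed shapes, not real feed data).
entries = [
    SimpleNamespace(title='A', link='http://example.com/a', content=[]),
    SimpleNamespace(title='B', summary='short'),
]
report = missing_fields(entries)
```

Here `report` lists only the second entry, which has neither `link` nor `content`.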

You will then need to write your code to handle the slight variations, either by using try/except or by checking for specific attributes with hasattr.
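As a minimal sketch of the hasattr approach, the loop body from the question could be moved into a helper that tolerates missing fields. The name `extract_entry` and the fallbacks (using `summary` when `content` is absent, and `id` when `link` is absent) are assumptions for illustration; which fallback makes sense depends on the tags the problem feeds actually use. The `SimpleNamespace` objects again stand in for real feedparser entries.

```python
from types import SimpleNamespace


def extract_entry(entry):
    """Build a plain dict from a feed entry, tolerating missing fields."""
    e = {}
    e['title'] = getattr(entry, 'title', '')
    # Not every feed provides a link; fall back to the entry id if present.
    e['link'] = getattr(entry, 'link', None) or getattr(entry, 'id', None)
    # Full content is optional; some feeds only carry a summary.
    if hasattr(entry, 'content') and entry.content:
        e['content'] = entry.content[0].value
    else:
        e['content'] = getattr(entry, 'summary', '')
    return e


# Stand-in entries for illustration only; real ones would come from
# feedparser.parse(url).entries.
full = SimpleNamespace(title='A', link='http://example.com/a',
                       content=[SimpleNamespace(value='<p>body</p>')])
sparse = SimpleNamespace(title='B', id='tag:example.com,2023:b',
                         summary='short')

ent = [extract_entry(entry) for entry in (full, sparse)]
```

With a real feed, the last line would become `ent = [extract_entry(entry) for entry in results.entries]`. The same effect could be had by wrapping each attribute access in try/except AttributeError; the getattr form just keeps the defaults in one place.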

If you post a link to one of the feeds in question I might be able to offer some more advice.
