开发者

python feedparser

开发者 https://www.devze.com 2023-02-11 09:09 出处:网络
How would you parse xml data as follows with python feedparser <Book_API> <Contributor_List>

How would you parse xml data as follows with python feedparser

<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_N开发者_如何学Pythoname>John Smith</Display_Name>
</Contributor_List>
</Book_API>


That doesn't look like any sort of RSS/ATOM feed. I wouldn't use feedparser at all for that, I would use lxml. In fact, feedparser can't make any sense of it and drops the "Jason" contributor in your example.

from lxml import etree

data = <fetch the data somehow>
root = etree.parse(data)

Now you have a tree of xml objects. How to do it in lxml more specifically is impossible to say until you actually give valid XML data. ;)


As Lennart Regebro mentioned, it seems not a RSS/Atom feed but just XML document. There are several XML parsing facilities (SAX and DOM both) in Python standard libraries. I recommend you ElementTree. Also lxml is best one (which is drop-in replacement of ElementTree) in third party libraries.

try:
    from lxml import etree
except ImportError:
    try:
        from xml.etree.cElementTree as etree
    except ImportError:
        from xml.etree.ElementTree as etree

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""
xml_doc = etree.fromstring(doc)
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号