I'm trying to parse an XML file with BeautifulSoup. In all tutorials on the net, the content of the xml is given like
xml = "<doc><tag1>Contents 1<tag2>Contents 2<tag1>Contents 3"
soup = BeautifulStoneSoup(xml)
but I want to give only xml file'开发者_Go百科s path. In mechanize one can use get_data() method but it only works for html files. Any sugestions ?
The BeautifulSoup documentation says that:
"A Beautiful Soup constructor takes an XML or HTML document in the form of a string (or an open file-like object). It parses the document and creates a corresponding data structure in memory."
In the formulation of your question, you use BeautifulStoneSoup
, and allthough the online documentation uses strings, the docstring for the constructor reveals that:
"The Soup object is initialized as the 'root tag', and the provided markup (which can be a string or a file-like object) is fed into the underlying parser."
精彩评论