Comments in XML at beginning of document

Developer https://www.devze.com 2023-01-02 22:30 Source: Internet
My Python XML parser fails if there's a comment at the beginning of an XML file, like:

<?xml version="1.0" encoding="utf-8"?>
<!-- Script version: "1"-->
<!-- Date: "07052010"-->
<component name="abc">
<pp>
    ....
</pp>
</component>

Is it illegal to place a comment like this?

EDIT:

Well, it's not throwing an error, but the DOM module fails to recognize the child nodes:

import xml.dom.minidom as dom
sub_tree = dom.parse('xyz.xml')
for component in sub_tree.firstChild.childNodes:
    print(component)

I cannot access the child nodes; sub_tree.firstChild.childNodes returns an empty list, but if I remove those two comments I can loop through the list and read the child nodes as usual!

EDIT:

Guys, this simple example works and is enough to reproduce the problem. Start your Python shell and run the small script above. With the comments in place it prints nothing; after deleting the comments it prints the node!


If you do this:

import xml.dom.minidom as dom
sub_tree = dom.parse('xyz.xml')
print(sub_tree.childNodes)

You will see what your problem is:

>>> print(sub_tree.childNodes)
[<DOM Comment node " Script ve...">, <DOM Comment node " Date: "07...">, <DOM Element: component at 0x7fecf88c>]

firstChild will obviously pick up the first child, which is a comment and doesn't have any children of its own. You could iterate over the children and skip all comment nodes.
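Here is a minimal sketch of that idea, assuming the XML from the question is held in a string instead of xyz.xml. The names XML_TEXT and element_children are made up for illustration. It also shows documentElement, minidom's attribute for the root element, which gets you past the leading comments without any filtering:

```python
import xml.dom.minidom as dom

# Hypothetical stand-in for the contents of xyz.xml from the question.
XML_TEXT = """<?xml version="1.0" encoding="utf-8"?>
<!-- Script version: "1"-->
<!-- Date: "07052010"-->
<component name="abc">
<pp>
</pp>
</component>
"""


def element_children(node):
    """Return only the element children of node, skipping comments and text."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


doc = dom.parseString(XML_TEXT)

# Option 1: filter the document's children instead of trusting firstChild.
root = element_children(doc)[0]

# Option 2: ask for the root element directly; both give the same node.
assert root is doc.documentElement

for child in element_children(root):
    print(child.tagName)
```

Filtering on nodeType also drops the whitespace-only text nodes that minidom keeps between elements, which would otherwise show up in the loop.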

Or you could ditch the DOM model and use ElementTree, which is so much nicer to work with. :)
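For comparison, a rough ElementTree version of the same loop, again assuming the XML is in a string (XML_TEXT is a made-up name; for a file you would use ET.parse('xyz.xml').getroot() instead). With the default parser the comments are simply dropped, so the root is the component element right away:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of xyz.xml from the question.
XML_TEXT = """<?xml version="1.0" encoding="utf-8"?>
<!-- Script version: "1"-->
<!-- Date: "07052010"-->
<component name="abc">
<pp>
</pp>
</component>
"""

# fromstring returns the root element directly; the leading comments
# are discarded by the default parser.
root = ET.fromstring(XML_TEXT)
print(root.tag, root.get("name"))

# Iterating an element yields its child elements only.
for child in root:
    print(child.tag)
```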


It is legal; from XML 1.0 Reference:

2.5 Comments

[Definition: Comments may appear anywhere in a document outside other markup; in addition, they may appear within the document type declaration at places allowed by the grammar.] They are not part of the document's character data; an XML processor MAY, but need not, make it possible for an application to retrieve the text of comments. For compatibility, the string "--" (double-hyphen) MUST NOT occur within comments. Parameter entity references MUST NOT be recognized within comments.


To get better answers, show us (a) a small complete Python script and (b) a small complete XML document that together demonstrate the unexpected behaviour.

Have you considered using ElementTree?


That should be legal as long as the XML declaration is on the first line.
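A quick way to check this yourself, sketched with two made-up one-line documents (GOOD, BAD and the helper parses are illustrative names). Strictly speaking the declaration has to be the very first thing in the document, so a comment placed before it is rejected by the expat parser that minidom uses:

```python
import xml.dom.minidom as dom
from xml.parsers.expat import ExpatError

# Comment after the XML declaration: allowed.
GOOD = '<?xml version="1.0"?><!-- note --><component/>'
# Comment before the XML declaration: not allowed.
BAD = '<!-- note --><?xml version="1.0"?><component/>'


def parses(text):
    """Return True if minidom can parse text, False on a parse error."""
    try:
        dom.parseString(text)
        return True
    except ExpatError:
        return False


print(parses(GOOD))
print(parses(BAD))
```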
