I'm trying to process XML using Python's minidom, and then output the result using toprettyxml(). I ran into two problems:
- There are added blank lines.
- There are added newlines and tabs for text nodes.
Here's the code and output:
$ cat test.py
from xml.dom import minidom
dom = minidom.parse("test.xml")
print(dom.toprettyxml())
$ cat test.xml
<?xml version="1.0" encoding="UTF-8"?>
<store>
<product>
<fruit>orange</fruit>
</product>
</store>
$ python test.py
<?xml version="1.0" ?>
<store>
<product>
<fruit>
orange
</fruit>
</product>
</store>
I can work around problem 1 using strip() to remove blank lines, and I can work around problem 2 using the hack (fixed_writexml) described in this link: http://ronrothman.com/public/leftbraned/xml-dom-minidom-toprettyxml-and-silly-whitespace/, but I was wondering if there's a better solution, since the hack is almost 3 years old now. I'm open to using something other than minidom, but I'd like to avoid adding external packages like lxml.
One solution is to patch the minidom library with the patch proposed for the bug you mention.
I haven't tested it myself, and it's a bit hacky too, so it may not suit you!
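If patching the library is not an option, another approach that stays within the standard library is to remove the whitespace-only text nodes from the parsed tree before calling toprettyxml(), since those nodes are what show up as blank lines. Below is a minimal sketch of that idea. The helper name strip_whitespace_nodes is my own and not part of minidom, and the XML is inlined from test.xml so the snippet runs on its own. Whether the text of <fruit> stays on one line depends on the Python version; as far as I know, recent releases keep it inline, while older ones still need the fixed_writexml hack for problem 2.

```python
from xml.dom import minidom

# Contents of test.xml from the question, inlined so the example is self-contained.
XML = """<?xml version="1.0" encoding="UTF-8"?>
<store>
<product>
<fruit>orange</fruit>
</product>
</store>
"""


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace.

    Hypothetical helper (not a minidom API). It mutates the tree in place.
    """
    # Iterate over a copy, because removeChild() modifies childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


dom = minidom.parseString(XML)
strip_whitespace_nodes(dom)
pretty = dom.toprettyxml(indent="  ")
print(pretty)
```

Note that this drops every whitespace-only text node, so it is only appropriate when such whitespace carries no meaning in your documents.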