I am using the following code to read an XML file and write it to an XML output file using the SAX parser. However, the output file is missing the CDATA markers. The contents of the CDATA section are written correctly, but the opening <![CDATA[ and the closing ]]> are not present in the output file!
    from xml.sax import make_parser
    from xml.sax.handler import ContentHandler
    import sys

    class XMLWriter():
        def __init__(self, xWriter):
            self.xWriter = xWriter

        def startElement(self, name, attrs):
            self.xWriter.write('<' + name)
            for sAttribute in attrs.getNames():
                self.xWriter.write(' %s="%s"' % (sAttribute, attrs.getValue(sAttribute)))
            self.xWriter.write('>')

        def characters(self, ch):
            self.xWriter.write(ch)

        def endElement(self, name):
            self.xWriter.write('</' + name + '>')

        def processingInstruction(self, target, data):
            return

        def setDocumentLocator(self, dummy):
            return

        def startDocument(self):
            return

        def endDocument(self):
            return

    parser = make_parser()
    curHandler = XMLWriter(open('test.out.xml', 'w'))
    parser.setContentHandler(curHandler)
    parser.parse(open('test.xml'))
What am I doing wrong?
A CDATA section is a serialization convenience for including text that contains markup characters without escaping them. Whether a text node was wrapped in CDATA in a particular serialization is not part of its content, so a parser may preserve that fact or discard it. If your SAX parser reports events for CDATA sections, you have to handle those events and re-wrap the text node in CDATA on the way out.
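With Python's xml.sax, CDATA boundaries are not delivered to the ContentHandler. They are lexical events, which the default expat-based reader reports to an object registered under property_lexical_handler. Below is a minimal sketch of that approach, assuming make_parser() returns the expat reader. The names CDATAPreservingWriter and copy_with_cdata are made up for illustration, and the escaping of ordinary text and attribute values is an addition that the question's code does not have.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, property_lexical_handler
from xml.sax.saxutils import escape, quoteattr


class CDATAPreservingWriter(ContentHandler):
    """Hypothetical handler: echoes SAX events and re-wraps CDATA sections."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.in_cdata = False

    # --- content events ---
    def startElement(self, name, attrs):
        self.out.write('<' + name)
        for attr in attrs.getNames():
            # quoteattr adds the surrounding quotes and escapes the value
            self.out.write(' %s=%s' % (attr, quoteattr(attrs.getValue(attr))))
        self.out.write('>')

    def endElement(self, name):
        self.out.write('</' + name + '>')

    def characters(self, ch):
        # Text inside CDATA is written verbatim; other text is escaped.
        self.out.write(ch if self.in_cdata else escape(ch))

    # --- lexical events (the reader looks these methods up on the handler) ---
    def startCDATA(self):
        self.in_cdata = True
        self.out.write('<![CDATA[')

    def endCDATA(self):
        self.out.write(']]>')
        self.in_cdata = False

    def comment(self, content):
        pass  # comments are dropped in this sketch

    def startDTD(self, name, public_id, system_id):
        pass  # the DOCTYPE declaration is dropped in this sketch

    def endDTD(self):
        pass


def copy_with_cdata(source, out):
    """Parse `source` (a file name or file object) and echo it to `out`."""
    handler = CDATAPreservingWriter(out)
    parser = make_parser()
    parser.setContentHandler(handler)
    # Register the same object for the lexical events (CDATA, comments, DTD).
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(source)


if __name__ == '__main__':
    sample = io.BytesIO(b'<root><![CDATA[a < b && c]]></root>')
    result = io.StringIO()
    copy_with_cdata(sample, result)
    print(result.getvalue())
```

For the files in the question, this would be called as copy_with_cdata('test.xml', open('test.out.xml', 'w')). One object serves as both content handler and lexical handler here only so that the in_cdata flag is shared; two separate objects would work as well, provided the lexical one defines all five methods.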