
Is there a way to get SAXParser to ignore content of certain elements when parsing?

Developer https://www.devze.com 2023-02-13 16:23 Source: web

I have XML of the format:

...
<To>"Paul McCartney" <paul.mccartney@hotmail.com></To>
<From>"John Lennon" <john.lennon@yahoo.com></From>
...

The SAXParser throws an exception as soon as it gets to the email addresses. It thinks <paul.mccartney@hotmail.com> is an XML element and fails as soon as it encounters the @ symbol. Is there any way to ignore the content of certain elements in Java SAX?


If you're using DefaultHandler, you could try overriding org.xml.sax.helpers.DefaultHandler.error() and the similar methods (warning() and fatalError()). See also the JavaDoc of org.xml.sax.ErrorHandler:

http://download.oracle.com/javase/6/docs/api/org/xml/sax/ErrorHandler.html
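
A minimal sketch of that idea, assuming the JDK's built-in SAX parser. The names RecordingHandlerDemo, RecordingHandler and parseAndCollect, and the <Msg> root element, are made up for illustration. The handler records problems instead of rethrowing them. Keep in mind that a well-formedness violation is reported through fatalError(), and a parser is allowed to stop after a fatal error, so this will probably let you log the problem rather than skip past it:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class RecordingHandlerDemo {

    /** Hypothetical handler: collects parser complaints instead of rethrowing them. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            problems.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            problems.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) {
            // Well-formedness violations (such as the stray '<' here) arrive as fatal errors.
            problems.add("fatal: " + e.getMessage());
        }
    }

    /** Parses the given text and returns whatever the handler recorded. */
    static List<String> parseAndCollect(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // The parser may still abort after reporting a fatal error;
            // the handler has already recorded it, so nothing more to do here.
        }
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        // <Msg> is an assumed root element; the original snippet does not show one.
        String xml = "<Msg><To>\"Paul McCartney\" <paul.mccartney@hotmail.com></To></Msg>";
        for (String problem : parseAndCollect(xml)) {
            System.out.println(problem);
        }
    }
}
```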

Either way, the XML is not well-formed, and it shouldn't be produced like that. You could preprocess it and replace < with &lt; and > with &gt; inside the element content, or just wrap the whole content of <To> and <From> in a <![CDATA[ ]]> block...
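
The CDATA variant could look roughly like the sketch below. The class and method names (CdataPreprocessor, wrapInCdata, parseAddresses) and the <Message> root element are assumptions, not part of the original format. It also assumes that <To> and <From> carry no attributes, are not nested, and that their content never contains "]]>":

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CdataPreprocessor {

    // Matches <To>...</To> and <From>...</From>; DOTALL lets the content span lines.
    private static final Pattern ADDRESS_ELEMENT =
            Pattern.compile("<(To|From)>(.*?)</\\1>", Pattern.DOTALL);

    /** Wraps the content of every <To>/<From> element in a CDATA section. */
    static String wrapInCdata(String rawXml) {
        return ADDRESS_ELEMENT.matcher(rawXml).replaceAll("<$1><![CDATA[$2]]></$1>");
    }

    /** Preprocesses the text, parses it with SAX, and returns "Element: content" strings. */
    static List<String> parseAddresses(String rawXml) throws Exception {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <To> or <From>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("To") || qName.equals("From")) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // CDATA content is delivered through characters() like ordinary text.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (text != null) {
                    values.add(qName + ": " + text);
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(wrapInCdata(rawXml))), handler);
        return values;
    }

    public static void main(String[] args) throws Exception {
        // <Message> is an assumed root element; the original snippet elides it.
        String raw = "<Message>"
                + "<To>\"Paul McCartney\" <paul.mccartney@hotmail.com></To>"
                + "<From>\"John Lennon\" <john.lennon@yahoo.com></From>"
                + "</Message>";
        parseAddresses(raw).forEach(System.out::println);
    }
}
```

A regex pass like this is only as reliable as the assumptions above. If whoever produces the file can escape the content properly, that is the more robust fix.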


It's not XML, so an XML parser won't parse it. Get the format changed if you can; otherwise, your best bet is to roll your own parser specific to this format.
