I have a schema that I use with XmlBeans to unmarshal XML into Java objects. I have no control over the data that comes through.
One such field looks like <Name>Barnes & Noble</Name>. Parsing fails at the character & with a lexical error. Is there a way to specify an option while parsing XML files to ignore some special characters?
Any help you could provide will be great.
No. This is invalid XML. An ampersand in text must be escaped as "&amp;".
You can manually escape all ampersands before parsing the input as XML, but that may mess up other XML entities: an already-escaped &amp; would become &amp;amp;.
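One way to limit that damage is to escape only the ampersands that do not already start something that looks like an entity or character reference. The following is a minimal sketch, not a complete fix: the class and method names (AmpersandEscaper, escapeBareAmpersands) are made up for illustration, it uses only java.util.regex, and it assumes the input has no CDATA sections and that any "&name;" sequence is an intentional entity.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of XmlBeans.
final class AmpersandEscaper {

    // Matches '&' unless it is followed by a named entity (&amp;),
    // a decimal reference (&#38;) or a hex reference (&#x26;).
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    private AmpersandEscaper() {
    }

    // Returns the input with every bare '&' replaced by "&amp;".
    static String escapeBareAmpersands(String xml) {
        return BARE_AMPERSAND.matcher(xml).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escapeBareAmpersands("<Name>Barnes & Noble</Name>"));
        System.out.println(escapeBareAmpersands("<Name>AT&amp;T</Name>"));
    }
}
```

The cleaned string can then be handed to the parser in place of the raw input. Text such as "R&D;" would still be left alone because it looks like an entity, so this is a best-effort workaround rather than a guarantee of well-formed XML.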
You can parse <Name>Barnes & Noble</Name> as XPL and then feed it into any XML process. XPL is just like XML except that it allows XML's special characters in text elements.
You can use XmlOptionCharEscapeMap.
From the javadocs:
This class is used to set up a map containing characters to be escaped. Characters can be escaped as hex, decimal or as a predefined entity (this latter option applies only to the 5 characters defined as predefined entities in the XML Spec).
For example:
XmlOptionCharEscapeMap escapes = new XmlOptionCharEscapeMap();
escapes.addMapping('A', XmlOptionCharEscapeMap.HEXADECIMAL);
escapes.addMapping('B', XmlOptionCharEscapeMap.DECIMAL);
escapes.addMapping('>', XmlOptionCharEscapeMap.PREDEF_ENTITY);
XmlOptions opts = new XmlOptions();
opts.setSaveSubstituteCharacters(escapes);