I'm parsing XML in Java using StAX, but my XML is not well-formed, so the parser throws an error. The XML contains unclosed tags,
for example:
<person>
<name>John</name>
<age>21
...
...
</person>
The <age> tag has no closing </age> tag, so I need to fix the XML first.
How can I fix the XML so that the unclosed tags get closed?
Is there a library to do this? I've tried JTidy and HtmlCleaner, but I still can't figure out how to fix the XML. I need a Java library, not a standalone app. Thanks.
I don't think there is a ready-made solution to fix XML. That's because it's impossible to know whether
<person>
<name>John</name>
<age>21
<birthDate>...</birthDate>
...
</person>
is meant to be
<person>
<name>John</name>
<age>21
<birthDate>...</birthDate>
</age>
...
</person>
or
<person>
<name>John</name>
<age>21</age>
<birthDate>...</birthDate>
...
</person>
I think that kind of logic can only be handled by a custom String parser, where you specify how the data is to be transformed.
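As a rough illustration of such a custom parser, here is a minimal sketch that hard-codes one policy, namely the second interpretation above: an element that already contains text is treated as a leaf and gets closed before the next opening tag, and any element still open when an outer closing tag arrives is closed first. The class name TagCloser and the method closeUnclosedTags are made up for this example. The sketch assumes no mixed content and does not handle tags inside comments or CDATA sections, so treat it as a starting point rather than a general repair tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TagCloser {
    // Groups: 1 = "/" for a closing tag, 2 = element name,
    // 3 = attributes (kept verbatim), 4 = "/" for a self-closing tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_][\\w.:-]*)([^<>]*?)(/?)>");

    static String closeUnclosedTags(String xml) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // currently open elements, innermost first
        boolean topHasText = false;              // innermost open element has seen text
        int pos = 0;
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            String text = xml.substring(pos, m.start());
            pos = m.end();
            int end = endOfContent(text);
            String core = text.substring(0, end);  // text without trailing whitespace
            String trailing = text.substring(end); // trailing whitespace only
            if (!core.trim().isEmpty()) {
                topHasText = true;
            }
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(4).isEmpty();
            String name = m.group(2);

            if (closing) {
                out.append(core);
                if (!open.contains(name)) {
                    // Stray closing tag with no matching opener: drop it.
                    out.append(trailing);
                    continue;
                }
                // Close every element that was left open inside this one.
                while (!open.peek().equals(name)) {
                    out.append("</").append(open.pop()).append('>');
                }
                out.append(trailing).append(m.group());
                open.pop();
                topHasText = false;
            } else {
                if (topHasText && !open.isEmpty()) {
                    // Policy assumption: an element with text is a leaf,
                    // so close it before the next element starts.
                    out.append(core).append("</").append(open.pop()).append('>').append(trailing);
                } else {
                    out.append(text);
                }
                out.append(m.group());
                if (!selfClosing) {
                    open.push(name);
                }
                topHasText = false;
            }
        }
        out.append(xml.substring(pos));
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    // Index just past the last non-whitespace character of s.
    private static int endOfContent(String s) {
        int i = s.length();
        while (i > 0 && Character.isWhitespace(s.charAt(i - 1))) {
            i--;
        }
        return i;
    }
}
```

Calling TagCloser.closeUnclosedTags(brokenXml) returns a String that can then be handed to the StAX parser through a StringReader. If the first interpretation is the right one for your data (<birthDate> nested inside <age>), the leaf policy in the else branch is the part to change.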
Find the person who generated the XML and beat them senseless.
It's a basic point of XML that a document is always well-formed. This is very, very easy to do, equally easy to test, and it's a foundation stone for everything else. If someone out there is writing code which can't even get that right, they don't deserve to be working as a programmer. Seriously, they should be flipping burgers or digging ditches instead.
Writing code to deal with their crappy code is not a good long-term solution. It doesn't do anything to address the problem of their crappy code.
I appreciate that this probably doesn't help much.
Instead of fixing the XML you can try to turn off validation with:
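import javax.xml.stream.XMLInputFactory;
// Create the StAX factory and switch off validation before creating any reader
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
inputFactory.setProperty(XMLInputFactory.IS_VALIDATING, false);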
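To see whether that setting is enough for a particular input, a small check like the following may help. It is only a sketch: the class name WellFormedCheck and the method parsesWithoutValidation are invented for this example. Keep in mind that IS_VALIDATING controls DTD validation, and an XML parser is still expected to reject a document that is not well-formed, so an unclosed tag will most likely still raise an XMLStreamException.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for trying the setting out on a given document.
class WellFormedCheck {
    // Returns true if the whole document can be read with validation turned off.
    static boolean parsesWithoutValidation(String xml) {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.IS_VALIDATING, false);
        try {
            XMLStreamReader reader = inputFactory.createXMLStreamReader(new StringReader(xml));
            // Walk through every event; a well-formedness error surfaces as an exception.
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

If this returns false for your document, the XML has to be repaired before StAX can read it.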