How to fix unclosed-tag XML in Java

Developer https://www.devze.com 2022-12-16 01:36 Source: Internet
I'm parsing XML in Java using StAX, but my XML is not well-formed, so the parser throws an error. The XML contains unclosed tags.

For example:

<person>
  <name>John</name>
  <age>21
  ...
  ...
</person>

The <age> tag has no closing </age> tag, so I need to fix the XML first.

How can I fix the XML so that the unclosed tag gets closed?

Is there a library that does this? I've tried JTidy and HtmlCleaner, but I still can't figure out how to fix the XML. I need a Java library, not a standalone app. Thanks.


I don't think there is a ready-made solution for fixing XML. That's because it's impossible to know whether

<person>
  <name>John</name>
  <age>21
  <birthDate>...</birthDate>
  ...
</person>

is to be

<person>
  <name>John</name>
  <age>21
  <birthDate>...</birthDate>
  </age>
  ...
</person>

or

<person>
  <name>John</name>
  <age>21</age>
  <birthDate>...</birthDate>
  ...
</person>

I think that kind of logic can only be handled by a custom string parser, where you specify how the data is to be transformed.
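As a rough sketch of what such a custom parser could look like, the class below applies one explicit rule. An element that so far contains only text (no child elements) is treated as a leaf, and is closed as soon as another start tag appears. This picks the second interpretation above (<age>21</age>), which is an assumption about the data and not something the XML itself can tell you. The class name UnclosedTagFixer and the method fix are hypothetical. The regex-based tokenizer does not handle comments, CDATA sections, or attribute values containing '>'.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnclosedTagFixer {

    /** One open element, plus what it has contained so far. */
    private static final class Node {
        final String name;
        boolean hasText;   // saw non-whitespace text directly inside this element
        boolean hasChild;  // saw at least one child element
        Node(String name) { this.name = name; }
    }

    // Group 1: end-tag name. Group 2: start-tag name. Group 3: "/" if self-closing.
    private static final Pattern TAG = Pattern.compile(
            "</([A-Za-z_][\\w.:-]*)\\s*>|<([A-Za-z_][\\w.:-]*)[^<>]*?(/?)>");

    public static String fix(String xml) {
        StringBuilder out = new StringBuilder();
        Deque<Node> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        int pos = 0;
        while (m.find()) {
            appendText(xml.substring(pos, m.start()), out, open);
            pos = m.end();
            if (m.group(1) != null) {                  // end tag
                String name = m.group(1);
                if (!isOpen(open, name)) continue;     // assumption: drop stray end tags
                // Close everything left open inside the element being ended.
                while (!open.peek().name.equals(name)) closeTop(out, open);
                open.pop();
                out.append(m.group());
            } else {                                   // start tag
                Node top = open.peek();
                // Assumed rule: a text-only element is a leaf, so close it first.
                if (top != null && top.hasText && !top.hasChild) {
                    closeTop(out, open);
                    top = open.peek();
                }
                if (top != null) top.hasChild = true;
                out.append(m.group());
                if (m.group(3).isEmpty()) open.push(new Node(m.group(2)));
            }
        }
        appendText(xml.substring(pos), out, open);
        while (!open.isEmpty()) closeTop(out, open);   // close whatever is still open
        return out.toString();
    }

    private static void appendText(String text, StringBuilder out, Deque<Node> open) {
        out.append(text);
        if (!open.isEmpty() && !text.trim().isEmpty()) open.peek().hasText = true;
    }

    private static boolean isOpen(Deque<Node> open, String name) {
        for (Node n : open) {
            if (n.name.equals(name)) return true;
        }
        return false;
    }

    // Inserts the missing end tag before any trailing whitespace already written,
    // so "21\n  " becomes "21</age>\n  ".
    private static void closeTop(StringBuilder out, Deque<Node> open) {
        int end = out.length();
        while (end > 0 && Character.isWhitespace(out.charAt(end - 1))) end--;
        out.insert(end, "</" + open.pop().name + ">");
    }
}
```

Calling UnclosedTagFixer.fix(brokenXml) returns a string that can then be handed to the StAX parser. If the data really calls for the first interpretation, the leaf rule is the part to change.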


Find the person who generated the XML and beat them senseless.

It's a basic point of XML that a document is always well-formed. This is very, very easy to do, equally easy to test, and it's a foundation stone for everything else. If someone out there is writing code which can't even get that right, they don't deserve to be working as a programmer. Seriously, they should be flipping burgers or digging ditches instead.
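To illustrate how easy the test is, here is a minimal sketch of a well-formedness check using the standard StAX API. The class name WellFormedCheck and the method isWellFormed are made up for this example. It reads every event and treats any XMLStreamException as "not well-formed".

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class WellFormedCheck {

    /** Returns true if the parser can read the whole document without an error. */
    public static boolean isWellFormed(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs are not needed for a well-formedness check, so switch them off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();   // throws XMLStreamException on malformed input
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

Whoever generates the XML could run a check like this on the output before sending it.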

Writing code to deal with their crappy code is not a good long-term solution. It doesn't do anything to address the problem of their crappy code.

I appreciate that this probably doesn't help much.


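Instead of fixing the XML, you can try to turn off validation (note that this only disables DTD validation; a conforming parser will still report XML that is not well-formed):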

import javax.xml.stream.XMLInputFactory;

XMLInputFactory inputFactory = XMLInputFactory.newInstance();
inputFactory.setProperty(XMLInputFactory.IS_VALIDATING, false);