
Parse XML ampersand in Java

开发者 https://www.devze.com 2023-03-13 18:20 Source: web

I download an XML file, which I generate using PHP, that looks similar to this:

<?xml version="1.0" encoding="utf-8" ?> 
<customersXML> 
   ...
   <customer id="12" name="Me+%26+My+Brother" swid="1" /> 
   ...
</customersXML> 

Now I need to parse it in Java, but before parsing I URL-decode the whole document, so the XML becomes this:

<?xml version="1.0" encoding="utf-8" ?> 
<customersXML> 
   ...
   <customer id="12" name="Me & My Brother" swid="1" /> 
   ...
</customersXML> 

But when I parse the XML file using SAX, I get a problem with "&". How can I get around this?


The ampersand is a special character in XML (O'Reilly XML: Entities: Handling Special Content) and needs to be encoded. Replace it with &amp; before sending it.
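To illustrate the escaping, here is a minimal Java sketch. The class name AmpersandEscape and the helper escapeAttr are hypothetical, made up for this example; the generator in the question is PHP, so this only demonstrates the principle. It escapes the value before placing it in an attribute, then parses the result to show that the parser hands back the plain "&".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AmpersandEscape {
    // Hypothetical helper: escapes characters that are unsafe inside a
    // double-quoted attribute value. "&" must be replaced first, otherwise
    // the ampersands introduced by the other replacements would be escaped again.
    static String escapeAttr(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace("\"", "&quot;");
    }

    // Builds a tiny document with the escaped value and reads it back.
    static String roundTrip(String name) throws Exception {
        String xml = "<customer name=\"" + escapeAttr(name) + "\" />";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getAttribute("name");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(escapeAttr("Me & My Brother")); // Me &amp; My Brother
        System.out.println(roundTrip("Me & My Brother"));  // Me & My Brother
    }
}
```

The parser resolves &amp; back to "&" on its own, so the consuming code never needs to unescape anything by hand.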


If the XML in question isn't URL-encoded in the first place (which it doesn't look like it is), then you shouldn't be URL-decoding it. Breaking the XML and then "unbreaking" it really doesn't seem like the best way to go about it. Just use the original XML and parse that.


Never process XML as a string without parsing it, or you are liable to end up with something that is no longer XML, as you have discovered.


You should FIRST parse, THEN URL-decode.
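A minimal sketch of that order, assuming the document arrives as a string and only the name attribute of each customer element is URL-encoded, as in the question. The class name CustomerNames and the method readNames are hypothetical. The still-encoded XML goes to SAX untouched, and URLDecoder is applied only to the attribute value the parser returns.

```java
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CustomerNames {
    // Parses the XML as downloaded, then URL-decodes each customer's name.
    static List<String> readNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs)
                    throws SAXException {
                if (!"customer".equals(qName)) {
                    return;
                }
                String raw = attrs.getValue("name"); // e.g. Me+%26+My+Brother
                if (raw == null) {
                    return;
                }
                try {
                    // Decode AFTER parsing: "+" becomes a space, "%26" becomes "&".
                    names.add(URLDecoder.decode(raw, "UTF-8"));
                } catch (UnsupportedEncodingException e) {
                    throw new SAXException(e);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>"
                + "<customersXML>"
                + "<customer id=\"12\" name=\"Me+%26+My+Brother\" swid=\"1\" />"
                + "</customersXML>";
        System.out.println(readNames(xml)); // [Me & My Brother]
    }
}
```

Because the "&" only appears after the parser has finished with the markup, it can never be mistaken for the start of an entity reference.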
