Android java XML junk after document element

Developer https://www.devze.com 2023-02-03 17:45 Source: web
I'm using SAX to read and parse XML documents, and it works fine except for this particular site, where Eclipse tells me "junk after document element" and I get no data returned:

http://www.zachblume.com/apis/rhyme.php?format=xml&word=example

The site is not mine; I'm just trying to get some data from it.


Yes, that's not an XML document. It's trying to include more than one root element:

<?xml version="1.0"?> 
<word>ampal</word> 
<word>ample</word> 
<word>hampel</word> 
<word>hample</word> 
<word>lampl</word> 
<word>pampel</word>
<word>sample</word>

The parser regards everything after <word>ampal</word> as junk, because by that point it has already read a complete document. Hence the complaint about "junk after document element".

An XML document can have only one root element, but that root can contain any number of children. For example:

<?xml version="1.0"?> 
<words>
  <word>ampal</word> 
  <word>ample</word> 
  <word>hampel</word> 
  <word>hample</word> 
  <word>lampl</word> 
  <word>pampel</word> 
  <word>sample</word>
</words>


The page does not contain XML. It contains an XML snippet at best:

<?xml version="1.0"?> 
<word>ampal</word> 
<word>ample</word> 
<word>hampel</word> 
<word>hample</word> 
<word>lampl</word> 
<word>pampel</word> 
<word>sample</word> 

This is not well-formed because there is no single document element. SAX interprets the first <word> as the document element, and correctly reports "junk after document element": as far as it knows, the document element ends with the first </word>, and nothing but whitespace, comments, or processing instructions may follow it.

To get around the error, do not treat the response as an XML document. Download it as text, remove the XML declaration (<?xml version="1.0"?>), which is only allowed at the very start of a document, and then wrap the remainder in a fake document element before you try to process it.
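
The workaround described above could look roughly like the following. This is only a sketch under a few assumptions: the response has already been downloaded into a String, the wrapper element name <words> and the names FragmentParser, wrapInRoot and parseWords are made up for illustration, and the only elements of interest are the <word> entries shown in the sample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: the class and method names are illustrative only.
final class FragmentParser {

    // Drop a leading XML declaration (if present) and wrap the rest
    // in a synthetic root element so the text becomes well-formed.
    static String wrapInRoot(String fragment) {
        String body = fragment.replaceFirst("^\\s*<\\?xml[^>]*\\?>", "");
        return "<words>" + body + "</words>";
    }

    // Parse the wrapped fragment with SAX and collect the text of each <word>.
    static List<String> parseWords(String fragment) throws Exception {
        final List<String> words = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <word>

            // Depending on namespace settings, either qName or localName
            // may be empty, so fall back from one to the other.
            private String name(String localName, String qName) {
                return (qName == null || qName.isEmpty()) ? localName : qName;
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("word".equals(name(localName, qName))) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("word".equals(name(localName, qName)) && text != null) {
                    words.add(text.toString().trim());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(wrapInRoot(fragment))), handler);
        return words;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the downloaded response body.
        String response = "<?xml version=\"1.0\"?> \n<word>ampal</word> \n<word>ample</word>";
        System.out.println(parseWords(response)); // prints [ampal, ample]
    }
}
```

Wrapping the text in memory keeps the existing SAX handler logic unchanged; the only extra step is the string manipulation before parsing. If the site ever starts returning a proper single-root document, the wrapper simply adds one more level of nesting and the <word> handling still works.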
