I'm trying to parse a file from the web on Android using the DOM method.
The code in question is:
try {
    URL url = new URL("https://www.beatport.com/en-US/xml/content/home/detail/1/welcome_to_beatport");
    InputSource is = new InputSource(url.openStream());
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document document = db.parse(is);
    document.getDocumentElement().normalize();
} catch(Exception e) {
    Log.v(TAG, "Exception = " + e);
}
But I'm getting the following exception:
V/XMLParseTest1( 846):Exception = org.xml.sax.SAXParseException: name expected (position:START_TAG <null>@2:176 in java.io.InputStreamReader@43ea4538)
The file is being served to me gzipped. I've checked the is object in the debugger and its length is 6733 bytes (the same as the content length of the file in the response headers); however, if I save the file to my hard drive from the browser, its size is 59114 bytes. Furthermore, if I upload it to my own server, which doesn't gzip XML files when it serves them, and point the URL there, the code runs just fine.
I'm guessing that what happens is that Android tries to parse the gzipped stream.
Is there a way to first unzip the stream? Any other ideas?
You can wrap the result of url.openStream() in a GZIPInputStream, e.g.:
InputSource is = new InputSource(new GZIPInputStream(url.openStream()));
To auto-detect when to do this, use the Content-Encoding HTTP header, e.g.:
URLConnection connection = url.openConnection();
InputStream stream = connection.getInputStream();
// Only unzip when the server says the body is gzip-encoded.
if ("gzip".equals(connection.getContentEncoding())) {
    stream = new GZIPInputStream(stream);
}
InputSource is = new InputSource(stream);
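To try the idea without touching the network, here is a minimal sketch in plain Java. The class name GzipXmlDemo and the helpers decode, parse and gzip are made up for illustration, and the gzipped bytes are produced in memory rather than fetched from the Beatport URL. It only shows that wrapping the stream in a GZIPInputStream before handing it to the DOM parser gives a parseable document; it is not a drop-in replacement for the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class GzipXmlDemo {

    // Hypothetical helper: wrap the stream only when the encoding is gzip.
    public static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if ("gzip".equalsIgnoreCase(contentEncoding)) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Hypothetical helper: decode if needed, then parse with the DOM parser.
    public static Document parse(InputStream raw, String contentEncoding) throws Exception {
        InputStream stream = decode(raw, contentEncoding);
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(stream));
        document.getDocumentElement().normalize();
        return document;
    }

    // Stand-in for a gzipping server: compress a string in memory.
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] body = gzip("<root><item>hello</item></root>");
        Document document = parse(new ByteArrayInputStream(body), "gzip");
        System.out.println(document.getDocumentElement().getTagName());
    }
}
```

In real code the second argument to parse would come from connection.getContentEncoding(), as in the snippet above.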
By default, this implementation of HttpURLConnection requests that servers use gzip compression. Since getContentLength() returns the number of bytes transmitted, you cannot use that method to predict how many bytes can be read from getInputStream(). Instead, read that stream until it is exhausted: when read() returns -1. Gzip compression can be disabled by setting the acceptable encodings in the request header:
urlConnection.setRequestProperty("Accept-Encoding", "identity");
With that header set, the server should send the XML uncompressed, so no unzipping is needed.
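The point about getContentLength() can be illustrated with a small sketch, again in plain Java and without a network call. The class ReadUntilExhausted and its readAll and gzip helpers are hypothetical names; the payload is an arbitrary in-memory byte array standing in for the XML file. It reads a decompressing stream until read() returns -1 and compares the number of bytes read with the size of the compressed payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ReadUntilExhausted {

    // Read the stream until read() returns -1 instead of trusting a length header.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Stand-in for a gzipping server: compress a byte array in memory.
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(data);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[59114];          // arbitrary, highly compressible payload
        byte[] transmitted = gzip(original);        // what would travel over the wire
        byte[] decoded = readAll(new GZIPInputStream(new ByteArrayInputStream(transmitted)));
        System.out.println("transmitted bytes: " + transmitted.length);
        System.out.println("decoded bytes:     " + decoded.length);
    }
}
```

The transmitted size plays the role of the content length from the question, and the decoded size is what a parser actually needs to see.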