retrieving 'nulls' from website using Java URL input stream

Developer https://www.devze.com 2023-01-01 10:46 Source: web
I'm trying to read the text from a website using the Java URL input stream:

URL u = new URL(str);
br3 = new BufferedReader(new InputStreamReader(u.openStream()));
while (true)
    System.out.println(br3.readLine());

This seems to work fine for most websites, but for some URL shortening services like LinkBee it comes back blank, e.g. linkbee.com/FUAKF. I can view the page source in a browser, but I repeatedly get nulls when I use the above code.


It's because those sites are just redirection services. How are you handling redirects? (A redirect has a Location: header, but no body.)
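
One way to check whether a short link really answers with an HTTP redirect is to turn off automatic redirect following and look at the status code and the Location header yourself. This is only a minimal sketch: the class name RedirectProbe, the method redirectTarget and the example.com address are made up for illustration, and whether LinkBee actually redirects this way is not confirmed here.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class RedirectProbe {

    /**
     * Returns the Location header of a 3xx response, or null if the server
     * did not answer with a redirect status (or sent no Location header).
     */
    static String redirectTarget(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        // Do not follow the redirect; we only want to inspect the first response.
        conn.setInstanceFollowRedirects(false);
        try {
            int status = conn.getResponseCode();
            if (status >= 300 && status < 400) {
                return conn.getHeaderField("Location");
            }
            return null;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass the shortened link as the first argument.
        String address = args.length > 0 ? args[0] : "http://example.com/";
        System.out.println("Redirects to: " + redirectTarget(address));
    }
}
```

If this prints null for your link, the page is not using a header redirect and the target has to come from the body instead.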


Use an HTTP library such as Apache Commons HttpClient. It follows redirects automatically for GET requests, and you can then read the body with getResponseBodyAsStream.


Barry is correct.

I just wanted to add that some websites may also use JavaScript to redirect you to a different page. Something like this:

<script type="text/javascript"> <!-- window.location = "http://www.google.com/" //--> </script>

But in your situation it is probably the headers redirecting you, given that you are getting nulls back. Just thought you might want to watch out for the JavaScript case too.


It's true that it is a redirection service. However, I don't need to actually follow the redirection; I only need to extract the URL that it redirects to, which can be found within the source code of the redirection page itself (in the given case, at line 81):

input type='hidden' id='urlholder' value='http://www.megaupload.com/?d=02EBRUTT'

Regardless, I don't think the stream should be giving me a complete blank, unless it reads only the body and not the headers?
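
For what it's worth, BufferedReader.readLine() returns null once the end of the stream is reached, so a while(true) loop keeps printing null after the body is exhausted, and prints nothing but nulls if the body is empty. A rough sketch of reading until null and then pulling the value out of the hidden input could look like this. The class and method names are hypothetical, and the regex assumes the id attribute appears before value inside the tag, as in the line quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlHolderExtractor {

    // Matches id='urlholder' followed, within the same tag, by value='...'.
    private static final Pattern URL_HOLDER = Pattern.compile(
        "id=['\"]urlholder['\"][^>]*?value=['\"]([^'\"]*)['\"]");

    /** Reads every line until readLine() returns null (end of stream). */
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        BufferedReader br = new BufferedReader(in);
        String line;
        while ((line = br.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    /** Returns the value of the urlholder input, or null if it is not present. */
    static String extractUrl(String html) {
        Matcher m = URL_HOLDER.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        // Sample markup taken from the line quoted in the post.
        String sample =
            "<input type='hidden' id='urlholder' value='http://www.megaupload.com/?d=02EBRUTT'>";
        System.out.println(extractUrl(readAll(new StringReader(sample))));
    }
}
```

In real use the Reader would be the InputStreamReader wrapped around u.openStream(). If extractUrl returns null there, the body that came back did not contain the input at all.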
