I want to get the content of a page from a URL with this code:
public static String getContentResult(URL url) throws IOException {
    InputStream in = url.openStream();
    StringBuffer sb = new StringBuffer();
    byte[] buffer = new byte[256];
    while (true) {
        int byteRead = in.read(buffer);
        if (byteRead == -1)
            break;
        for (int i = 0; i < byteRead; i++) {
            sb.append((char) buffer[i]);
        }
    }
    return sb.toString();
}
But with this URL: http://portal.acm.org/citation.cfm?id=152610.152611&coll=DL&dl=GUIDE&CFID=114782066&CFTOKEN=85539315 I can't get the abstract: "Database management systems will continue to manage....."
Can you suggest a solution to this problem? Thanks in advance.
Printing the response headers of the GET request gives:
HTTP/1.1 302 Moved Temporarily
Connection: close
Date: Thu, 18 Nov 2010 15:35:24 GMT
Server: Microsoft-IIS/6.0
location: http://portal.acm.org/citation.cfm?id=152610.152611&coll=DL&dl=GUIDE
Content-Type: text/html; charset=UTF-8
This means that the server wants you to request the address given in the location header instead. So either you read that header directly from the URLConnection and follow the link yourself, or you use HttpClient, which follows redirects automatically. The code below is based on HttpClient:
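import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

public class HttpTest {
    public static void main(String... args) throws Exception {
        System.out.println(readPage(new URL("http://portal.acm.org/citation.cfm?id=152610.152611&coll=DL&dl=GUIDE&CFID=114782066&CFTOKEN=85539315")));
    }

    private static String readPage(URL url) throws Exception {
        // HttpClient follows the 302 redirect for us.
        DefaultHttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet(url.toURI());
        HttpResponse response = client.execute(request);
        Reader reader = null;
        try {
            // The response headers shown above declare charset=UTF-8.
            reader = new InputStreamReader(response.getEntity().getContent(), "UTF-8");
            StringBuilder sb = new StringBuilder();
            int read;
            char[] cbuf = new char[1024];
            // Copy the body into the builder one chunk at a time.
            while ((read = reader.read(cbuf)) != -1) {
                sb.append(cbuf, 0, read);
            }
            return sb.toString();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}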
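For the other option mentioned above (reading the location header from the connection and following it yourself), a minimal sketch using only java.net could look like the following. The class name RedirectFollower, the helper names and the limit of 5 hops are made up for illustration, and UTF-8 is assumed because of the Content-Type header shown earlier. Note that HttpURLConnection may already follow same-protocol redirects on its own by default; the sketch switches that off so that each hop is explicit.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RedirectFollower {
    // Hypothetical safety limit so a redirect loop cannot run forever.
    private static final int MAX_REDIRECTS = 5;

    // True for the HTTP status codes that carry a Location header to follow.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Location may be absolute or relative; resolve it against the current URL.
    static String resolveLocation(String base, String location) {
        return URI.create(base).resolve(location).toString();
    }

    static String readPage(String address) throws IOException {
        String current = address;
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(current).openConnection();
            // Handle redirects ourselves instead of letting the JDK do it.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (isRedirect(status)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location header");
                }
                current = resolveLocation(current, location);
                continue;
            }
            // Not a redirect: read the body (assumed to be UTF-8).
            try (Reader reader = new InputStreamReader(
                    conn.getInputStream(), StandardCharsets.UTF_8)) {
                StringBuilder sb = new StringBuilder();
                char[] cbuf = new char[1024];
                int read;
                while ((read = reader.read(cbuf)) != -1) {
                    sb.append(cbuf, 0, read);
                }
                return sb.toString();
            }
        }
        throw new IOException("Too many redirects");
    }
}
```

Calling RedirectFollower.readPage(...) with the URL from the question would then fetch whatever the final location returns. This sketch does not keep cookies between hops, so a site that depends on them may still behave differently than in a browser.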
There's no "Database management..." text at the given URL. Perhaps it's loaded dynamically by JavaScript. You'll need a more sophisticated application to download such content ;)
The content you're looking for is not included in the page at this URL. Open it in your browser and view the source code: instead of the abstract, you will see that many JavaScript files are loaded. I think the content is fetched later by AJAX calls, so you would need to find out how it is loaded.
The Firefox plugin Firebug could be helpful for a more detailed analysis.
The URL that you should be using is:
http://portal.acm.org/citation.cfm?id=152610.152611&coll=DL&dl=GUIDE
because the original URL you posted sends a redirect, as dacwe mentioned.