Currently I'm using HttpURLConnection to load a remote web page and present it to my clients (by copying the InputStream to the HttpResponse's OutputStream). It loads the HTML correctly but skips the images. How can I fix this?
Thanks
You need to manipulate the HTML so that all resource URLs on the intranet domain are proxied as well. E.g. all of the following resource references in the HTML
<base href="http://intranet.com/" />
<script src="http://intranet.com/script.js"></script>
<link href="http://intranet.com/style.css" />
<img src="http://intranet.com/image.png" />
<a href="http://intranet.com/page.html">link</a>
should be changed so that they go through your proxy servlet instead, e.g.
<base href="http://example.com/proxy/" />
<script src="http://example.com/proxy/script.js"></script>
<link href="http://example.com/proxy/style.css" />
<img src="http://example.com/proxy/image.png" />
<a href="http://example.com/proxy/page.html">link</a>
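The mapping itself is just a prefix swap on absolute URLs. Here is a minimal sketch of it in plain Java, assuming the two base URLs from the example above; the class name ProxyUrlRewriter and the method rewrite are made up for illustration and are not part of any library.

```java
final class ProxyUrlRewriter {
    // Assumed base URLs, taken from the example above; adjust them to your setup.
    private static final String INTRANET_BASE = "http://intranet.com";
    private static final String PROXY_BASE = "http://example.com/proxy";

    private ProxyUrlRewriter() {}

    /** Returns the proxied form of an absolute intranet URL, or the URL unchanged. */
    static String rewrite(String absoluteUrl) {
        // Accept the bare host or the host followed by a path, so that a
        // lookalike such as "http://intranet.com.evil.org" is left alone.
        if (absoluteUrl.equals(INTRANET_BASE) || absoluteUrl.startsWith(INTRANET_BASE + "/")) {
            return PROXY_BASE + absoluteUrl.substring(INTRANET_BASE.length());
        }
        return absoluteUrl;
    }
}
```

With these assumptions, ProxyUrlRewriter.rewrite("http://intranet.com/image.png") gives "http://example.com/proxy/image.png", while a URL on any other host comes back unchanged.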
An HTML parser like Jsoup is extremely helpful for this. You can do the following in your proxy servlet, which I assume is mapped on a URL pattern of /proxy/*.
    // Imports needed: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element,
    // java.net.URL, java.net.URLConnection, java.io.*, java.util.*
    String intranetURL = "http://intranet.com";
    String internetURL = "http://example.com/proxy";

    if (request.getRequestURI().endsWith(".html")) { // An HTML page is requested.
        Document document = Jsoup.connect(intranetURL + request.getPathInfo()).get();

        // Point every href and src at the proxy instead of the intranet host.
        for (Element element : document.select("[href]")) {
            element.attr("href", element.absUrl("href").replaceFirst(intranetURL, internetURL));
        }
        for (Element element : document.select("[src]")) {
            element.attr("src", element.absUrl("src").replaceFirst(intranetURL, internetURL));
        }

        response.setContentType("text/html;charset=UTF-8");
        response.setCharacterEncoding("UTF-8");
        response.getWriter().write(document.html());
    }
    else { // Other resources like images, etc.
        URLConnection connection = new URL(intranetURL + request.getPathInfo()).openConnection();

        // Pass the upstream response headers on to the client.
        for (Map.Entry<String, List<String>> header : connection.getHeaderFields().entrySet()) {
            if (header.getKey() == null) continue; // The HTTP status line is stored under a null key.
            for (String value : header.getValue()) {
                response.addHeader(header.getKey(), value);
            }
        }

        InputStream input = connection.getInputStream();
        OutputStream output = response.getOutputStream();
        // Now just copy input to output.
    }
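For the "copy input to output" step, a plain buffered loop is enough. This is only a sketch: the class name StreamCopier and the 8 KB buffer size are my own choices, not something the servlet API prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopier {
    private StreamCopier() {}

    /** Copies all bytes from input to output and returns the number of bytes copied. */
    static long copy(InputStream input, OutputStream output) throws IOException {
        byte[] buffer = new byte[8192]; // Arbitrary buffer size chosen for this sketch.
        long total = 0;
        int read;
        // read() returns -1 once the end of the stream has been reached.
        while ((read = input.read(buffer)) != -1) {
            output.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

In the servlet this would be called as StreamCopier.copy(input, output), ideally with the input closed afterwards in a finally block or with try-with-resources. On Java 9 and newer, input.transferTo(output) does the same job without a helper.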
You have to make a separate request for each image. That's what browsers do as well.
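To illustrate: every reference in the page resolves to its own URL, and each of those URLs is fetched with its own request. A small sketch using java.net.URI follows; the class name ResourceUrls and the sample paths are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class ResourceUrls {
    private ResourceUrls() {}

    /** Resolves each reference found in a page against that page's own URL. */
    static List<String> resolveAll(String pageUrl, List<String> references) {
        URI base = URI.create(pageUrl);
        List<String> resolved = new ArrayList<>();
        for (String reference : references) {
            // Relative references are resolved against the page URL;
            // absolute URLs are returned as they are.
            resolved.add(base.resolve(reference).toString());
        }
        return resolved;
    }
}
```

For a page at http://intranet.com/docs/page.html, the reference image.png resolves to http://intranet.com/docs/image.png and /img/logo.png resolves to http://intranet.com/img/logo.png. Each resolved URL is one more request that has to be served for the page to display completely.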