开发者

Make HttpURLConnection load web pages with images

开发者 https://www.devze.com 2023-03-12 01:10 出处:网络
Currently I\'m using HttpURLConnection for load remote web page and present to my clients (using InputStream 开发者_运维百科to HttpResponse\'s outputStream transfer), it loads html correctly but skips

Currently I'm using HttpURLConnection for load remote web page and present to my clients (using InputStream 开发者_运维百科to HttpResponse's outputStream transfer), it loads html correctly but skips images, how to fix it?

Thanks


You need to manipulate the HTML that way so that all resource URLs on the intranet domain are proxied as well. E.g. all of the following resource references in HTML

<base href="http://intranet.com/" />
<script src="http://intranet.com/script.js"></script>
<link href="http://intranet.com/style.css" />
<img src="http://intranet.com/image.png" />
<a href="http://intranet.com/page.html">link</a>

should be changed in the HTML that way so that they go through your proxy servlet instead, e.g.

<base href="http://example.com/proxy/" />
<script src="http://example.com/proxy/script.js"></script>
<link href="http://example.com/proxy/style.css" />
<img src="http://example.com/proxy/image.png" />
<a href="http://example.com/proxy/page.html">link</a>

A HTML parser like Jsoup is extremely helpful in this. You can do as follows in your proxy servlet which is, I assume, mapped on an URL pattern of /proxy/*.

String intranetURL = "http://intranet.com";
String internetURL = "http://example.com/proxy";

if (request.getRequestURI().endsWith(".html")) { // A HTML page is requested.
    Document document = Jsoup.connect(intranetURL + request.getPathInfo()).get();

    for (Element element : document.select("[href]")) {
        element.attr("href", element.absUrl("href").replaceFirst(intranetURL, internetURL));
    }

    for (Element element : document.select("[src]")) {
        element.attr("src", element.absUrl("src").replaceFirst(intranetURL, internetURL));
    }

    response.setContentType("text/html;charset=UTF-8");
    response.setCharacterEncoding("UTF-8");
    resposne.getWriter().write(document.html());
}
else { // Other resources like images, etc.
    URLConnection connection = new URL(intranetURL + request.getPathInfo()).openConnection();

    for (Map.Entry<String, List<String>> header : connection.getHeaderFields().entrySet()) {
        for (String value : header.getValue()) {
            response.addHeader(header.getKey(), value);
        }
    }

    InputStream input = connection.getInputStream();
    OutputStream output = response.getOutputStream();
    // Now just copy input to output.
}


You have to make a separate request for each image. That's what browsers do as well.

0

精彩评论

暂无评论...
验证码 换一张
取 消