I am trying to load HTML content in C# (to simulate the PHP function file_get_contents) using the following code:
protected string file_get_contents(string fileName)
{
    string sContents = string.Empty;
    if (fileName.ToLower().IndexOf("http:") > -1)
    {
        // URL
        System.Net.WebClient wc = new System.Net.WebClient();
        byte[] response = wc.DownloadData(fileName);
        sContents = System.Text.Encoding.ASCII.GetString(response);
    }
    else
    {
        // Regular filename
        System.IO.StreamReader sr = new System.IO.StreamReader(fileName);
        sContents = sr.ReadToEnd();
        sr.Close();
    }
    return sContents;
}
However, the images in the HTML do not load when I render the content. When I use PHP's file_get_contents instead, the images do load when the content is rendered.
Does anyone have any idea why?
Looking at the documentation for file_get_contents, there shouldn't be a difference between using a WebClient and this function to download the contents of an HTML page. In both cases, you'll get a string like this:
string html = @"<!DOCTYPE html>
<html>
<head>
<title>Hello World</title>
</head>
<body>
Image: <img src=""/image/test.png"">
</body>
</html>";
Of course, /image/test.png is relative to the URL you downloaded the HTML page from. So if you download the HTML page from http://www.example.com/test.html and render the retrieved data at http://www.yourdomain.net/foo/bar/qux.html, the image tag will refer to http://www.yourdomain.net/image/test.png rather than http://www.example.com/image/test.png.
You need to rewrite relative URLs to absolute ones before rendering the HTML page. Have a look at the Html Agility Pack for this task.
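As a rough illustration of the rewriting step, here is a minimal sketch that uses only System.Uri and a regular expression instead of the Html Agility Pack. The names RelativeUrlRewriter and MakeUrlsAbsolute are hypothetical, and the regex only handles double-quoted src and href attributes, so treat it as a starting point under those assumptions rather than a robust solution; a real HTML parser copes much better with unquoted attributes, srcset, CSS url(...) references and similar cases.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RelativeUrlRewriter
{
    // Simplification: only matches src="..." and href="..." with double quotes.
    private static readonly Regex UrlAttribute = new Regex(
        @"\b(?<attr>src|href)\s*=\s*""(?<url>[^""]*)""",
        RegexOptions.IgnoreCase);

    // pageUrl is the address the HTML was originally downloaded from.
    public static string MakeUrlsAbsolute(string html, string pageUrl)
    {
        Uri baseUri = new Uri(pageUrl);

        return UrlAttribute.Replace(html, match =>
        {
            string original = match.Groups["url"].Value;

            // Uri.TryCreate(Uri, string, out Uri) resolves a relative
            // reference against the base; absolute URLs pass through unchanged.
            if (Uri.TryCreate(baseUri, original, out Uri absolute))
            {
                return match.Groups["attr"].Value + "=\"" + absolute.AbsoluteUri + "\"";
            }

            // Leave anything that cannot be parsed exactly as it was.
            return match.Value;
        });
    }
}
```

With the example above, calling RelativeUrlRewriter.MakeUrlsAbsolute(html, "http://www.example.com/test.html") should turn the image tag into <img src="http://www.example.com/image/test.png">, so it keeps pointing at the original server no matter where the page is rendered.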