I am attempting to load each URL in a sitemap.xml file in an effort to pre-cache the pages and speed up the user's experience.
I have the following code, which grabs the URLs from the sitemap:
$ch = curl_init();
/**
 * Set the URL of the page or file to download.
 */
curl_setopt($ch, CURLOPT_URL, 'http://onlineservices.letterpart.com/sitemap.xml;jsessionid=1j1agloz5ke7l?id=1j1agloz5ke7l');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
$xml = new SimpleXMLElement($data);
foreach ($xml->url as $url_list) {
    $url = $url_list->loc;
    echo $url . "<br>";
}
I am now trying to use fsockopen to load each URL in turn,
where $url is in this format: http://onlineservices.letterpart.com:80/content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4
foreach ($xml->url as $url_list) {
    $url = $url_list->loc;
    $fp = fsockopen ($url,80);
    if ($fp) {
        fwrite($fp, "GET / HTTP/1.1\r\nHOST: $url\r\n\r\n");
        while (!feof($fp)) {
            print fread($fp,256);
        }
        fclose ($fp);
    } else {
        print "Fatal error\n";
    }
}
But this is giving me this error for each url:
[12-May-2011 13:34:09] PHP Warning: fsockopen() [function.fsockopen]: unable to connect to http://onlineservices.letterpart.com:80/content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4:-1 (Unable to find the socket transport "http" - did you forget to enable it when you configured PHP?) in /home/digital1/public_html/dev/sitemap.php on line 32
I have read that I need to use "just the hostname, not the URL in the fsockopen call. You'll need to provide the uri, minus the host/port in the actual HTTP headers"
so I tried this:
$fp = fsockopen ("http://onlineservices.letterpart.com",80);
if ($fp) {
    fwrite($fp, "GET / HTTP/1.1\r\nHOST: content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4\r\n\r\n");
    while (!feof($fp)) {
        print fread($fp,256);
    }
    fclose ($fp);
} else {
    print "Fatal error\n";
}
But I still get the same error.
EDIT:
If I change the fsockopen call to:
$fp = fsockopen ("onlineservices.letterpart.com",80);
then I get a slightly different and better, but still wrong, response. It seems to be ignoring the onlineservices.letterpart.com part and trying http:///content/, but it has appended /web/ui.xql?action=html&resource=login.html to the end of the URL. That is our login page, so the request must be reaching our server:
HTTP/1.1 302 Moved Temporarily
Date: Thu, 12 May 2011 14:40:02 GMT
Server: Jetty/5.1.12 (Windows 2003/5.2 x86 java/1.6.0_07
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: JSESSIONID=nh62zih3q8mf;Path=/
Location: http:///content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4/web/ui.xql?action=html&resource=login.html
Content-Length: 0
Thanks.
fsockopen is not intended for making HTTP requests; cURL is a better choice (and much more powerful).
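As a rough sketch of the cURL route, assuming the sitemap has already been downloaded into a string as in the question: the helper names sitemap_urls and warm_url are made up for illustration, and the 30-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: collect the <loc> values from a sitemap document.
function sitemap_urls(string $sitemapXml): array
{
    $xml = new SimpleXMLElement($sitemapXml);
    $urls = [];
    foreach ($xml->url as $entry) {
        // Cast to string so callers get plain URLs, not SimpleXMLElement objects.
        $urls[] = trim((string) $entry->loc);
    }
    return $urls;
}

// Hypothetical helper: request one URL with cURL and return the HTTP status
// code of the final response, or 0 if the request failed entirely.
function warm_url(string $url): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // assumed timeout, in seconds
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

With $data holding the sitemap from the question, something like foreach (sitemap_urls($data) as $url) { warm_url($url); } would then request each page in turn.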
There is also file_get_contents, which makes it quick (it needs allow_url_fopen enabled to fetch URLs):
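foreach ($xml->url as $url_list) {
    // Cast the SimpleXMLElement to a plain string before using it as a URL.
    $url = (string) $url_list->loc;
    // The response body is discarded; making the request is what warms the cache.
    file_get_contents($url);
}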
Useful for warming up an application cache!
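If fsockopen has to be used anyway, the advice quoted in the question comes down to this: fsockopen reads an "http://" prefix as a transport name (hence the "socket transport" warning), so it should get only the hostname, while the path goes in the request line and the hostname in the Host header. A minimal sketch, assuming plain HTTP on the port given in the URL (80 if none); build_request and fetch_with_socket are hypothetical names:

```php
<?php
// Hypothetical helper: split a URL into the host and port fsockopen needs
// and the raw HTTP request text to send over the socket.
function build_request(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException("Cannot parse URL: $url");
    }
    $host = $parts['host'];
    $port = $parts['port'] ?? 80;
    $path = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }
    // The path belongs in the request line; the Host header carries the hostname.
    $request = "GET $path HTTP/1.1\r\n"
             . "Host: $host\r\n"
             . "Connection: close\r\n\r\n";
    return [$host, $port, $request];
}

// Hypothetical helper: send the request and return the raw response
// (headers and body), or false if the connection could not be opened.
function fetch_with_socket(string $url)
{
    [$host, $port, $request] = build_request($url);
    $fp = fsockopen($host, $port, $errno, $errstr, 10); // 10 s connect timeout (assumed)
    if (!$fp) {
        return false;
    }
    fwrite($fp, $request);
    $response = stream_get_contents($fp);
    fclose($fp);
    return $response;
}
```

The "Connection: close" header asks the server to close the socket after responding, so the read does not hang waiting on a keep-alive connection.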