I have the following function to get Googlebot's last access date:
//get googlebot last access
function googlebot_lastaccess($domain_name)
{
$request = 'http://webcache.googleusercontent.com/search?hl=en&q=cache:'.$domain_name.'&btnG=Google+Search&meta=';
$data = getPageData($request);
$spl=explode("as it appeared on",$data);
//echo "<pre>".$spl[0]."</pre>";
$spl2=explode(".<br>",$spl[1]);
$value=trim($spl2[0]);
//echo "<pre>".$spl2[0]."</pre>";
if(strlen($value)==0)
{
return(0);
}
else
{
return($value);
}
}
echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />";
function getPageData($url) {
if(function_exists('curl_init')) {
$ch = curl_init($url); // initialize curl with given url
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // add useragent
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable
if((ini_get('open_basedir') == '') && (ini_get('safe_mode') == 'Off')) {
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
}
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max. seconds to execute
curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error
return @curl_exec($ch);
}
else {
return @file_get_contents($url);
}
}
But this script prints a snapshot of the whole page to the screen, i.e. the whole page as cached by Google. I only want to capture the date and time that follows the words "as it appeared on" and print it, e.g.: 8 Oct 2011 14:03:12 GMT.
How can I do that?
Replace this line:
echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />";
with this:
$content = googlebot_lastaccess($domain_name);
$date = substr($content , 0, strpos($content, 'GMT') + strlen('GMT'));
echo "Googlebot last access = ".$date."<br />";
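Note that if the returned text does not contain 'GMT', strpos returns false and the substr call above keeps only the first three characters. As a rough alternative sketch (not part of the original answer), a regular expression can pick out the timestamp directly and report when none is found. The helper name extract_cache_date and the sample string are hypothetical, and the pattern assumes the date always looks like the example in the question:

```php
<?php
// Hypothetical helper: return the first "D Mon YYYY HH:MM:SS GMT" timestamp
// found in $text, or null when there is none.
function extract_cache_date($text)
{
    // Assumed format, based on the question's example: 8 Oct 2011 14:03:12 GMT
    $pattern = '/\b(\d{1,2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2} GMT)\b/';
    if (preg_match($pattern, $text, $matches)) {
        return $matches[1];
    }
    return null; // no timestamp found
}

// Made-up sample text standing in for the value of googlebot_lastaccess().
$sample = "as it appeared on 8 Oct 2011 14:03:12 GMT. The current page could have changed";
$date = extract_cache_date($sample);
echo "Googlebot last access = " . ($date !== null ? $date : "unknown") . "<br />";
```

In the script from the question, $sample would be replaced by the result of googlebot_lastaccess($domain_name).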
Why query Google for when it last visited your site when you can detect Googlebot on your site directly and see which pages it's on? That also lets you track where Googlebot went with a simple write-to-database function.
See the Stack Overflow question "How to detect search engine bots with PHP?"
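A minimal sketch of that idea, assuming a plain User-Agent substring check is good enough and using a log file instead of a database to keep the example short. The function names and the googlebot.log path are hypothetical. The User-Agent header can be spoofed, so this only shows that a client claims to be Googlebot:

```php
<?php
// Hypothetical helper: true when the User-Agent string mentions Googlebot.
function is_googlebot($userAgent)
{
    return stripos((string) $userAgent, 'Googlebot') !== false;
}

// Hypothetical helper: append one line per visit (GMT time and requested URI).
// A database insert could take the place of the file write.
function log_googlebot_visit($logFile, $uri, $timestamp)
{
    $line = gmdate('j M Y H:i:s', $timestamp) . " GMT\t" . $uri . "\n";
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Read the request data defensively; both keys may be missing (e.g. on the CLI).
$ua  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';

if (is_googlebot($ua)) {
    log_googlebot_visit(__DIR__ . '/googlebot.log', $uri, time());
}
```

Including this at the top of every page would record each request that identifies itself as Googlebot; the last line of the log is then the most recent visit.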