I currently have a script that loads a page on my client's other server using cURL. The current settings are:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_HEADER, 0);
$usecookie = ROOT_PATH . "/public_html/football_parser/cookie.txt";
if ($usecookie) {
    if (!is_writable($usecookie)) {
        return "Can't write to $usecookie cookie file, change file permission to 777 or remove read only for windows.";
    }
    curl_setopt($ch, CURLOPT_COOKIEJAR, $usecookie);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $usecookie);
}
$output = curl_exec($ch);
I'm trying to load these two example URLs:
statto.com/football/teams/newcastle-united/2005-2006/results
and
statto.com/football/teams/newcastle-united/2008-2009/results
The second loads without any problems. The first fails to load unless curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE) is set. When it does load, however, it redirects to an error page, even though it is fine in my browser. I've been told that there is a 307 redirect on this page that switches between the page I see in my browser and the 404 error page I get in cURL. I can make this error page appear in my browser if I delete the UID cookie, but I've checked the cookie file on my server and the cookie appears to be set and present.
Can anyone tell me how to fetch the first URL with cURL and see what I see in my browser, not the 404 redirect?
Many Thanks
Michelle
When I view the first URL in my browser in incognito mode (clean cookie jar) the following happens:
307 redirect
Cache-Control:post-check=0, pre-check=0
Cache-Control:no-store, no-cache, must-revalidate
Connection:Keep-Alive
Content-Encoding:gzip
Content-Length:20
Content-Type:text/html
Date:Mon, 10 Sep 2012 08:30:40 GMT
Expires:Mon, 10 Sep 2012 08:30:40 GMT
Keep-Alive:timeout=5, max=50
Last-Modified:Mon, 10 Sep 2012 08:30:40 GMT
Location:/home/error/404
MS-Author-Via:DAV
Pragma:no-cache
Server:Apache
Set-Cookie:options=DD0505030; expires=Tue, 10-Sep-2013 08:30:40 GMT; path=/; domain=www.statto.com
Set-Cookie:uid=3bdb30f60000-00-00USbf62da837b5bb608f95715dea80a8efa; expires=Tue, 30-Oct-2012 08:30:40 GMT; path=/; domain=www.statto.com
Vary:Accept-Encoding
X-Powered-By:PleskLin
X-Robots-Tag:index, noarchive
As you can see, the response sends Location:/home/error/404 and sets the uid cookie at the same time. So this behavior comes from the website itself, which seems to have made a mistake (I can't think of any reason for this to be correct behavior). To compensate for their mistake, you'll have to first get the cookie set (make the request to this page and get redirected to the 404 error page) and then request the page AGAIN with the cookie you received the first time around.
Hopefully you can just do:
$output = curl_exec($ch);
$output = curl_exec($ch);
I can't actually remember whether cURL handles need to be reset between calls. If this doesn't work, try creating a second cURL handle with the same options as above and executing it after the first one. In that case, close the first handle before running the second, because the CURLOPT_COOKIEJAR file is only written when the handle is closed.
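To spell out the "request it twice" idea a bit more fully, here is a minimal sketch of how it could look with a single handle. The function name fetch_with_cookie_warmup and its parameters are made up for illustration, and I'm assuming the site accepts the cookie on the second request; I haven't run this against statto.com.

```php
<?php
// Hypothetical helper (the name and return shape are illustrative, not from
// the original script): request $url twice on ONE handle so that cookies set
// by the first, redirected response are sent along with the second request.
function fetch_with_cookie_warmup(string $url, string $cookieFile, string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_HEADER         => false,
        // COOKIEFILE turns on cURL's cookie engine and preloads saved cookies;
        // COOKIEJAR is where cookies are saved when the handle is released.
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_COOKIEJAR      => $cookieFile,
    ]);

    // First pass: expected to end up on the error page, but the Set-Cookie
    // headers from the 307 response are now held in the handle's memory.
    curl_exec($ch);

    // Second pass: set the original URL again (harmless, just to be explicit)
    // and repeat the request, this time with the cookies attached.
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);

    $result = [
        'body'      => $body, // string on success, false on failure
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'error'     => curl_error($ch),
    ];

    // Releasing the handle lets cURL write the cookies to $cookieFile.
    unset($ch);

    return $result;
}
```

You would call it with the full URL plus the $usecookie and $useragent values from the question, then check whether 'final_url' still ends in /home/error/404. If it does, the cookie is probably not being sent back; note that the Set-Cookie headers above use domain=www.statto.com, so it may be worth requesting the www host explicitly.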