I am using curl and PHP to find out information about a given URL (e.g. HTTP status code, mimetype, HTTP redirect location, page title, etc.).
$ch = curl_init($url);
$useragent = "Mozilla/5.0 (X11; U; Linux x86_64; ga-GB) AppleWebKit/532.9 (KHTML, like Gecko) Chrome/5.0.307.11 Safari/532.9";
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Accept: application/rdf+xml;q=0.9, application/json;q=0.6, application/xml;q=0.5, application/xhtml+xml;q=0.3, text/html;q=0.2, */*;q=0.1"
));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
$chinfo = curl_getinfo($ch);
curl_close($ch);
This generally works well. However, if the URL points to a large file, I get a fatal error:
Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 14421576 bytes)
Is there any way of preventing this? For example, by telling curl to give up if the file is too large, or by catching the error?
As a workaround, I've added
curl_setopt($ch, CURLOPT_TIMEOUT, 3);

which assumes that any file taking longer than 3 seconds to load would exhaust the allowed memory, but this is far from satisfactory.
Have you tried using CURLOPT_FILE to save the file directly to disk instead of using memory? You can even specify /dev/null to put it nowhere at all...
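A minimal sketch of that approach, assuming /dev/null is available (Linux-specific); the header-derived information (status code, mimetype, redirect target) is still available from curl_getinfo():

$ch = curl_init($url);
$devnull = fopen('/dev/null', 'w');
curl_setopt($ch, CURLOPT_FILE, $devnull); // stream the body straight to /dev/null instead of memory
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_exec($ch);
$chinfo = curl_getinfo($ch); // http_code, content_type, effective url, etc.
curl_close($ch);
fclose($devnull);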
Or, you can use CURLOPT_WRITEFUNCTION to set a custom data-writing function. Have the function just scan the headers and then throw away the actual data.
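A sketch of that, assuming PHP 5.3+ for the closure and an illustrative 1 MB cap (the limit is my assumption, not part of the answer). The callback must return the number of bytes it handled, so returning anything else makes curl abort the transfer instead of exhausting memory:

$maxBytes = 1024 * 1024; // illustrative cap, adjust to taste
$received = 0;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) use (&$received, $maxBytes) {
    $received += strlen($data);
    if ($received > $maxBytes) {
        return 0; // returning fewer bytes than were passed in aborts the transfer
    }
    return strlen($data); // accept the chunk, but don't keep it anywhere
});
if (curl_exec($ch) === false) {
    // transfer was aborted (or failed); curl_getinfo() still reports what was seen so far
}
$chinfo = curl_getinfo($ch);
curl_close($ch);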
Alternately, give PHP some more memory via php.ini.
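For example, either raise memory_limit in php.ini or, if your host allows it, at runtime (the 64M value is just an illustration):

ini_set('memory_limit', '64M'); // or set "memory_limit = 64M" in php.ini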
If you're getting header information, then why not use a HEAD request? That avoids the memory cost of loading the whole page into your maximum 16 MiB memory allotment.
curl_setopt($ch, CURLOPT_NOBODY, true); // send a HEAD request; no body is downloaded
curl_setopt($ch, CURLOPT_HEADER, true); // include the response headers in the returned data
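Combined with the curl_getinfo() call from the question, a sketch might look like:

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);        // HEAD request: headers only, no body
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_exec($ch);
$chinfo = curl_getinfo($ch);
curl_close($ch);

$status   = $chinfo['http_code'];    // HTTP status code
$mimetype = $chinfo['content_type']; // Content-Type header
$finalUrl = $chinfo['url'];          // effective URL after redirects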
Then, for the page title, use file_get_contents() instead, as it's much better with its native memory allocation.
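A sketch of that, using the optional length argument of file_get_contents() to cap how much is read; the 64 KB limit and the regex are illustrative assumptions rather than part of the original answer:

$head = file_get_contents($url, false, null, 0, 65536); // read at most 64 KB, usually enough to reach <title>
$title = null;
if ($head !== false && preg_match('/<title[^>]*>(.*?)<\/title>/is', $head, $m)) {
    $title = trim(html_entity_decode($m[1]));
}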