Faster alternative to file_get_contents()

Developer https://www.devze.com 2022-12-27 02:12 Source: Internet
Currently I'm using file_get_contents() to submit GET data to an array of sites, but upon execution of the page I get this error:

Fatal error: Maximum execution time of 30 seconds exceeded

All I really want the script to do is start loading the webpage, and then leave. Each webpage may take up to 5 minutes to load fully, and I don't need it to load fully.

Here is what I currently have:

        foreach($sites as $s) //Create one line to read from a wide array
        {
                file_get_contents($s['url']); // Send to the shells
        }

EDIT: To clear any confusion, this script is being used to start scripts on other servers, that return no data.

EDIT: I'm now attempting to use cURL to do the trick, by setting a timeout of one second to make it send the data and then stop. Here is my code:

        $ch = curl_init($s['url']); //load the urls
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); //Only send the data, don't wait.
        curl_exec($ch); //Execute
        curl_close($ch); //Close it off.

Perhaps I've set the option wrong. I'm looking through some manuals as we speak. Just giving you an update. Thank you to all of you who have been helping me so far.

EDIT: Ah, found the problem. I was using CURLOPT_CONNECTTIMEOUT instead of CURLOPT_TIMEOUT. Whoops.

However, now the scripts aren't triggering. They each use ignore_user_abort(TRUE); so I can't understand the problem.

Hah, scratch that. Works now. Thanks a lot everyone


There are many ways to solve this.

You could use cURL with its curl_multi_* functions to execute the requests asynchronously. Or use cURL the usual way with a timeout of 1 second: the call will time out and return, but the request will still have been sent.
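The second option, a plain cURL call that gives up after one second, might look roughly like the sketch below. trigger_url() is a hypothetical helper name, not part of cURL, and the approach assumes the remote script keeps running after the connection is dropped.

```php
<?php
// Sketch: send a GET request and stop waiting after $seconds.
// trigger_url() is a hypothetical helper, not a cURL function.
function trigger_url(string $url, int $seconds = 1): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,     // keep any response out of the page output
        CURLOPT_CONNECTTIMEOUT => $seconds, // limit for the connection phase only
        CURLOPT_TIMEOUT        => $seconds, // limit for the whole transfer
    ]);
    curl_exec($ch);           // result is ignored on purpose
    return curl_errno($ch);   // 28 (operation timed out) is the expected outcome
    // the handle is released when $ch goes out of scope
}
```

Called in the original loop, this would be trigger_url($s['url']); a non-zero return value is normal here, since the timeout is what ends the call.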

If you don't have cURL installed, you could keep using file_get_contents but fork processes (not so cool, but it works) using something like ZendX_Console_Process_Unix, so you avoid waiting between requests.


As Franco mentioned and I'm not sure was picked up on, you specifically want to use the curl_multi functions, not the regular curl ones. This packs multiple curl objects into a curl_multi object and executes them simultaneously, returning (or not, in your case) the responses as they arrive.

Example at http://php.net/curl_multi_init
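For this question, a curl_multi version could be sketched as follows. trigger_all() and its $maxSeconds deadline are assumptions made for illustration; the deadline is not a cURL feature, it is just a way to stop waiting once the requests are on their way.

```php
<?php
// Sketch: start all requests together and stop waiting after $maxSeconds.
// trigger_all() is a hypothetical helper; $urls is a list of URL strings.
function trigger_all(array $urls, float $maxSeconds = 2.0): int
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo responses
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }

    $running = 0;
    $deadline = microtime(true) + $maxSeconds;
    do {
        $status = curl_multi_exec($mh, $running); // drive all transfers at once
        if ($running > 0 && curl_multi_select($mh, 0.1) === -1) {
            usleep(10000); // select failed; pause briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK && microtime(true) < $deadline);

    // Detach the handles; transfers still in progress are abandoned here.
    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return count($handles); // number of requests that were started
}
```

With the $sites array from the question, the call would be trigger_all(array_column($sites, 'url')). Requests cut off at the deadline only finish their work if the remote scripts ignore the aborted connection.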


Re your update that you only need to trigger the operation:

You could try using file_get_contents with a timeout. This would lead to the remote script being called, but the connection being terminated after n seconds (e.g. 1).

If the remote script is configured so it continues to run even if the connection is aborted (in PHP that would be ignore_user_abort), it should work.
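On the remote side, that configuration might look like the minimal sketch below. run_long_job() is a hypothetical placeholder for whatever the remote script actually does.

```php
<?php
// Sketch of the remote script's opening lines.
ignore_user_abort(true); // keep running after the caller disconnects
set_time_limit(0);       // remove the execution time limit for the long job

// Hypothetical placeholder for the real work.
function run_long_job(): void
{
    // ... long-running processing goes here ...
}

run_long_job();
```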

Try it out. If it doesn't work, you won't get around increasing your time limit (set_time_limit) or using an external executable. But from what you're saying (you just need to make the request) this should work. You could even try setting the timeout to 0, but I wouldn't trust that.

From here:

<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1
        )
    )
);
file_get_contents("http://example.com/", 0, $ctx);
?>

To be fair, Chris's answer already includes this possibility: curl also has a timeout switch.


It is not file_get_contents() that consumes that much time, but the network connection itself.
Consider not submitting GET data to an array of sites; instead, create an RSS feed and let the sites fetch the data from it.


I don't fully understand the purpose of your script. But here is what you can do:

  1. To get rid of the fatal error quickly, you can add set_time_limit(120) at the beginning of the file. This will allow the script to run for 2 minutes. Of course, you can use any number you want, or 0 for no limit.
  2. If you just need to call the URL and you don't "care" about the result, you should use cURL in asynchronous mode. In this case a call to a URL will not wait until it has finished, and you can make all the calls very quickly.

BR.


If the remote pages take up to 5 minutes to load, your file_get_contents will sit and wait for that 5 minutes. Is there any way you could modify the remote scripts to fork into a background process and do the heavy processing there? That way your initial hit will return almost immediately, and not have to wait for the startup period.

Another possibility is to investigate if a HEAD request would do the trick. HEAD does not return any data, just headers, so it may be enough to trigger the remote jobs and not wait for the full output.
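A HEAD request can be sent with cURL's CURLOPT_NOBODY option, as in the sketch below. send_head() is a hypothetical helper name, and whether a HEAD request is enough to start the remote job depends on how the remote server and script handle it, so treat this as something to test rather than a guaranteed fix.

```php
<?php
// Sketch: send a HEAD request and return the HTTP status code (0 if none was received).
// send_head() is a hypothetical helper, not a cURL function.
function send_head(string $url, int $seconds = 5): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,     // issue HEAD instead of GET
        CURLOPT_RETURNTRANSFER => true,     // do not echo anything
        CURLOPT_TIMEOUT        => $seconds, // give up after this many seconds
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```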
