I have a CGI script that takes a really long time to execute. Long story short, it needs to process a lot of data, run a bunch of slow commands, and make some slow web queries, during which time it doesn't output anything, and when it's done, it finally prints its results out in JSON format. It takes several minutes to run, which is longer than the Timeout directive set in my Apache web server's httpd.conf.
I am not at liberty to change that Timeout value globally for everyone on the entire server. I thought of maybe overriding it on a per-directory basis using a .htaccess file, but the Timeout directive is not allowed in .htaccess context, so that cannot be done. From what I understand, my script must continually output data, and if it doesn't output data for the Timeout number of seconds, Apache gives up.
I am getting the following error in Apache: (70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed
What can I do?
Well, to offer the stupidly simple solution, why not just make the script occasionally produce some output while it's working? You could just print "Processing..." every few steps, or if you want to be more creative, have it print some status updates to indicate what it's doing. Or if you're worried about getting bored, print out a funny poem a line at a time. (Kind of reminds me of http://pages.cs.wisc.edu/~veeve/404.html)
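A minimal sketch of that idea in Python, under a few assumptions: the function name run_with_heartbeat and the list of step callables are made up for illustration, and nothing between the script and Apache is buffering the output. Since the final result is JSON, the heartbeat here is a bare newline rather than "Processing...", because leading whitespace does not stop the body from parsing as JSON.

```python
import io
import json
import sys
import time


def run_with_heartbeat(steps, out=sys.stdout, heartbeat="\n"):
    """Run each slow step in turn, writing and flushing a heartbeat after each one."""
    # Send the headers right away so the first bytes leave the script early.
    out.write("Content-Type: application/json\n\n")
    out.flush()

    results = []
    for step in steps:
        results.append(step())
        # A newline is insignificant whitespace in front of a JSON document.
        out.write(heartbeat)
        out.flush()  # without the flush the bytes may sit in a buffer

    out.write(json.dumps(results))
    out.flush()
    return results


if __name__ == "__main__":
    def slow_step():
        # Stand-in for a slow command or web query (hypothetical).
        time.sleep(0.01)
        return "ok"

    run_with_heartbeat([slow_step, slow_step])
```

This only helps if every individual step finishes within the Timeout; a single step that runs longer than that would still need one of the other approaches.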
If you don't want to do that, the next thing that comes to my mind is to use asynchronous processing. Basically, you'll have to spawn a separate process from the CGI script, and do the lengthy processing in that separate process. The main CGI script itself just outputs a simple HTML page that says the process is working and then exits. That HTML page would also have to contain some logic for periodically checking to see whether the background process on the server has finished. It could be a <meta http-equiv="refresh" ...> HTML element, or you could use AJAX.
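Roughly, the server side of the asynchronous approach could look like the Python sketch below. Everything named here is an assumption for illustration: start_job, job_status, result_path, the job directory, and the convention that the worker command receives the result file path as its last argument and only creates that file (via a rename) once it is completely written.

```python
import json
import os
import subprocess
import sys
import uuid


def result_path(job_dir, job_id):
    """Map a job id to its result file, rejecting ids that are not plain hex."""
    # The id comes back from the client, so never let it contain path characters.
    if not job_id or any(c not in "0123456789abcdef" for c in job_id):
        raise ValueError("invalid job id")
    return os.path.join(job_dir, job_id + ".json")


def start_job(worker_cmd, job_dir):
    """Launch the worker detached from the CGI process and return a new job id."""
    job_id = uuid.uuid4().hex
    subprocess.Popen(
        list(worker_cmd) + [result_path(job_dir, job_id)],
        # Detach from the CGI pipes so the web server is not kept waiting on them.
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: run the worker in its own session
    )
    return job_id


def job_status(job_dir, job_id):
    """Report whether the worker has produced its result file yet."""
    path = result_path(job_dir, job_id)
    if not os.path.exists(path):
        return {"done": False}
    with open(path) as f:
        return {"done": True, "result": json.load(f)}
```

The first request would call start_job and return a page carrying the job id; the meta refresh or AJAX poll would then hit a second, fast CGI script that calls job_status and returns immediately either way.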
I came up with a solution.
I would start by outputting a dummy HTTP header, like Dummy: ..., and I can put whatever data I want as the value of that header without affecting the rest of the output. So I output a character into that dummy value every minute or so, which prevents the timeout. When I am ready, I print a line break and continue printing the rest of my (real) HTTP headers and the content of the document.
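Here is a rough Python sketch of that dummy-header trick, not a tested recipe for any particular Apache setup. The function name respond_with_dummy_header, the work callable, and the use of a background thread to write the filler characters are my own assumptions; the interval should be comfortably shorter than the server's Timeout, and very long header values may run into header size limits.

```python
import io
import json
import sys
import threading
import time


def respond_with_dummy_header(work, out=sys.stdout, interval=60.0):
    """Pad a dummy header with dots while work() runs, then send the real response."""
    # Open the dummy header but do not terminate the line yet.
    out.write("Dummy: ")
    out.flush()

    done = threading.Event()

    def tick():
        # Event.wait returns False on timeout and True once done is set.
        while not done.wait(interval):
            out.write(".")
            out.flush()

    ticker = threading.Thread(target=tick, daemon=True)
    ticker.start()
    try:
        result = work()
    finally:
        # Stop the ticker before the main thread writes anything else.
        done.set()
        ticker.join()

    out.write("\n")  # end the dummy header line
    out.write("Content-Type: application/json\n\n")  # real headers, then blank line
    out.write(json.dumps(result))
    out.flush()
    return result


if __name__ == "__main__":
    def slow_work():
        # Stand-in for the real processing (hypothetical).
        time.sleep(0.2)
        return {"status": "finished"}

    respond_with_dummy_header(slow_work, interval=0.05)
```

Unlike printing progress text into the body, this keeps the body itself untouched, so the client still receives nothing but the JSON document.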
A very pragmatic approach could be to start a background job and email the response to the client. Ten to one they'd prefer that to keeping a browser window open all afternoon.
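For the email route, the background job could finish with something like the Python sketch below. The addresses, the subject line, the attachment name, and the assumption that an SMTP server is listening on localhost port 25 are all placeholders, not details from the question.

```python
import json
import smtplib
from email.message import EmailMessage


def build_result_email(sender, recipient, result):
    """Build a message that carries the job's result as a JSON attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Your report is ready"
    msg.set_content("The job has finished; the results are attached as JSON.")
    msg.add_attachment(
        json.dumps(result, indent=2).encode("utf-8"),
        maintype="application",
        subtype="json",
        filename="result.json",
    )
    return msg


def send_result_email(msg, host="localhost", port=25):
    """Hand the message to an SMTP server (assumed to be reachable at host:port)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

The CGI script itself would only need to collect the address, start the job, and reply with a short "we'll email you" page.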