I'm looking for a way of scheduling tasks where a task starts once several previous tasks have completed.
I have several hundred "collector" processes which collect data from a variety of sources and dump it to a database. Once these have finished collecting (anywhere from 1 second to a few minutes) I want to immediately kick off a bunch of "data-processing" processes to analyse and make sense of the data in the database. When all of these have finished I want a final task to start and send me an email of the summary data.
I'm currently using a Gearman queue and starting the data-processing tasks on timers once I expect the "collector" processes to have completed, but this means that the processing step starts after 10 minutes, even if the collector processes finished after 3 (or worse, have not yet finished).
Ideally I'd be able to specify specific rules like "start process X when process A and (B or C) complete", or "start process Y when 95% of the specified processes have completed or 10 minutes have elapsed".
The processes and dependencies need to be created automatically, as it will be run with different parameters each time (i.e. I'm not doing an identical calculation each time).
I could write some kind of graph-dependency framework myself using queues and monitors, but it seems like the sort of thing that must have already been solved and I'm looking for anyone who has used something like I describe.
"start process X when process A and (B or C) complete"
Why not let worker X launch subworkers A, B and C and wait for them to complete before proceeding? You can have a process X that is both a Gearman worker and a client at the same time.
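To make the idea concrete, here is a minimal sketch of "X waits for A and (B or C)". It uses Python's asyncio rather than Gearman, purely to show the control flow; `run_task`, `first_of` and `process_x` are hypothetical names, and the sleeps stand in for submitting a real job and waiting for it.

```python
import asyncio

async def run_task(name, delay):
    # Hypothetical stand-in for submitting a job to a queue and waiting for it.
    await asyncio.sleep(delay)
    return name

async def first_of(*aws):
    """Return the result of whichever awaitable finishes first."""
    tasks = [asyncio.ensure_future(a) for a in aws]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        # Design choice for this sketch: stop the slower alternative.
        t.cancel()
    return next(iter(done)).result()

async def process_x():
    # X proceeds only when A is done AND at least one of B or C is done.
    a = run_task("A", 0.02)
    b_or_c = first_of(run_task("B", 0.05), run_task("C", 0.01))
    a_result, bc_result = await asyncio.gather(a, b_or_c)
    return f"X ran after {a_result} and {bc_result}"

result = asyncio.run(process_x())
print(result)
```

Whether the slower of B and C should be cancelled or left running depends on whether its output is still needed; the sketch cancels it only to keep the example tidy.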
You have some very peculiar conditions:
- B or C
- 95% complete or 10 minutes elapsed
At first I thought your processes were simply asynchronous. In that case you could use deferreds and promises. I use these a lot in JavaScript when dealing with Ajax calls for data. With them you're basically configuring a dependency graph.
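As a rough illustration of the dependency-graph idea, the sketch below uses Python asyncio tasks in place of JavaScript promises. The graph is a plain dict built at run time, so it can differ on every run. `run_graph`, `work` and the task names are assumptions made up for this example, and the graph is assumed to be acyclic.

```python
import asyncio

async def run_graph(graph, work):
    """Run each node once all of its dependencies have finished.

    graph: dict mapping a task name to the list of names it depends on
           (every dependency must itself be a key; no cycles).
    work:  coroutine function called as work(name) to do the actual job.
    Returns the names in the order they completed.
    """
    tasks = {}
    order = []

    async def run_node(name):
        # A task can be awaited by several dependents; each gets the same result.
        for dep in graph[name]:
            await tasks[dep]
        await work(name)
        order.append(name)

    # Create all tasks first; none starts running until we yield to the loop.
    for name in graph:
        tasks[name] = asyncio.ensure_future(run_node(name))
    await asyncio.gather(*tasks.values())
    return order

async def work(name):
    # Hypothetical placeholder for the real collector/processing job.
    await asyncio.sleep(0.01)

graph = {
    "collect_1": [],
    "collect_2": [],
    "analyse": ["collect_1", "collect_2"],
    "email": ["analyse"],
}
order = asyncio.run(run_graph(graph, work))
print(order)
```

This only covers plain "all dependencies done" rules; the 'or' and timeout cases need something extra.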
But your case is even more complex. Apparently you need an 'or', progress monitoring and timers.
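The "95% complete or 10 minutes elapsed" rule could be sketched like this, again in Python asyncio and not as a Gearman feature. `wait_for_quorum` and `collector` are hypothetical names, and the demo uses a 50% threshold and sub-second timings so it finishes quickly.

```python
import asyncio
import math

async def wait_for_quorum(aws, fraction, timeout):
    """Return once `fraction` of the tasks are done or `timeout` seconds pass.

    Returns (done, pending) as sets of tasks; pending tasks keep running
    unless the caller cancels them.
    """
    tasks = [asyncio.ensure_future(a) for a in aws]
    needed = math.ceil(fraction * len(tasks))
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    done, pending = set(), set(tasks)
    while pending and len(done) < needed:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break  # time is up, proceed with whatever has finished
        finished, pending = await asyncio.wait(
            pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED)
        done |= finished
    return done, pending

async def collector(i, delay):
    # Hypothetical stand-in for one collector process.
    await asyncio.sleep(delay)
    return i

async def demo():
    jobs = [collector(i, d) for i, d in enumerate([0.01, 0.02, 0.5, 0.5])]
    done, pending = await wait_for_quorum(jobs, fraction=0.5, timeout=0.2)
    for t in pending:
        t.cancel()  # in this demo the stragglers are simply dropped
    return sorted(t.result() for t in done)

finished = asyncio.run(demo())
print(finished)
```

For the rule in the question you would call it with `fraction=0.95` and `timeout=600`, then start the processing step on whatever is in `done`.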
This is all very un-PHP-like. PHP has very poor cron job support, no built-in support for asynchronous tasks, and no timers. Why are you doing this in PHP?