Is there a standard pattern for scanning a job table and executing some actions?

开发者 https://www.devze.com 2022-12-22 15:57 Source: web
(I realize that my title is poor. If after reading the question you have an improvement in mind, please either edit it or tell me and I'll change it.)

I have the relatively common scenario of a job table which has one row for each thing that needs to be done. For example, it could be a list of emails to be sent. The table looks something like this:

ID    Completed    TimeCompleted   anything else...
----  ---------    -------------   ----------------
1     No                           blabla
2     No                           blabla
3     Yes          01:04:22
...

I'm looking either for a standard practice/pattern (or code - C#/SQL Server preferred) for periodically "scanning" (I use the term "scanning" very loosely) this table, finding the not-completed items, doing the action and then marking them completed once done successfully.

In addition to the basic process for accomplishing the above, I'm considering the following requirements:

  • I'd like some means of "scaling linearly", e.g. running multiple "worker processes" simultaneously or threading or whatever. (Just a specific technical thought - I'm assuming that as a result of this requirement, I need some method of marking an item as "in progress" to avoid attempting the action multiple times.)
  • Each item in the table should only be executed once.

Some other thoughts:

  • I'm not particularly concerned with the implementation being done in the database (e.g. in T-SQL or PL/SQL code) vs. some external program code (e.g. a standalone executable or some action triggered by a web page) which is executed against the database.
  • Whether the "doing the action" part is done synchronously or asynchronously is not something I'm considering as part of this question.


If you're willing to consider non-database technologies, the best (though not the only) solution is message queuing (often in conjunction with a database that contains each job's details). Message queues provide a lot of functionality, but the basic workflow is simple:

1) One process puts a 'job message' (perhaps just an id) on a queue.

2) Another process keeps an eye on the queue. It polls the queue for work, and pulls jobs it finds off the queue, one at a time, in the order they were received. Items you've pulled off the queue are effectively marked as 'in progress' - they are no longer available to other processes.

3) For critical workflows, you can perform a transactional read - in the event of a system failure, the transaction rolls back and the message is still on the queue. If there's some other kind of exception (like a timeout during a database read), you might just forward the message to a special error queue.

The simplest way to scale this is to have your reader process dispatch multiple threads to handle jobs it pulls off the queue. Alternately, you can scale out using multiple reader processes, which may be on separate servers.
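To make the workflow concrete, here is a minimal in-process sketch in Python. It uses the standard library's `queue.Queue` as a stand-in for a real message queue, so it shows only the shape of the pattern (enqueue ids, several workers pull one job at a time, failures go to an error queue) and none of the durability or transactional guarantees a real broker provides. The names `run_workers`, `handler` and `num_workers` are made up for illustration.

```python
import queue
import threading


def run_workers(job_ids, handler, num_workers=4):
    """Process job ids with several worker threads.

    Returns (completed_ids, failed_ids), both sorted.
    `handler` is a hypothetical callable that performs the real action.
    """
    jobs = queue.Queue()    # stand-in for the main message queue
    errors = queue.Queue()  # stand-in for the special error queue
    done = []
    done_lock = threading.Lock()

    # Step 1: a producer puts a "job message" (just an id) on the queue.
    for job_id in job_ids:
        jobs.put(job_id)

    def worker():
        while True:
            # Step 2: pull one job at a time. Once pulled, the id is no
            # longer visible to other workers (implicitly "in progress").
            try:
                job_id = jobs.get_nowait()
            except queue.Empty:
                return
            try:
                handler(job_id)
                with done_lock:
                    done.append(job_id)
            except Exception:
                # Step 3 (simplified): forward failures to the error queue.
                errors.put(job_id)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    failed = []
    while not errors.empty():
        failed.append(errors.get())
    return sorted(done), sorted(failed)
```

Because each id is removed from the queue by exactly one `get_nowait()` call, no job is handled twice, which is the property the question asks for. Scaling out to separate processes or servers would need a real broker in place of `queue.Queue`.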

.NET support includes Microsoft Message Queuing (MSMQ), accessed through either Windows Communication Foundation or the classes in the System.Messaging namespace. It requires some setup and configuration (you have to create the queues and configure permissions), but it's worth it.


In order to scale, you might want to consider scanning for jobs that are ready then adding them to a message queue. This way multiple consumers can read ready jobs off the queue. Marking jobs as "in progress" could be as simple as putting that value in the Completed column, or you could add a TimeStarted column and have a pre-determined timeout period before a job will be reset and be eligible for another worker thread to process. (The latter approach assumes the processing failed if the time elapses without the job completing. Failing after some number of attempts should call for manual inspection of that job.) The same daemon process that scans the database for ready jobs to add to the queue can look for jobs that have timed out.
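The TimeStarted idea can be sketched roughly as follows, using Python with SQLite purely for illustration (the question targets SQL Server, where the same conditional UPDATE approach applies but the syntax and locking details differ). The table layout, the `status` values, `TIMEOUT_SECONDS` and `MAX_ATTEMPTS` are all assumptions, not something from the original schema. The current time is passed in as `now` so the behaviour is easy to follow.

```python
import sqlite3

TIMEOUT_SECONDS = 300  # assumed timeout before a job is considered stuck
MAX_ATTEMPTS = 3       # assumed limit before manual inspection is needed


def create_jobs_table(conn):
    # Hypothetical schema: status replaces the Completed Yes/No column.
    conn.execute(
        """CREATE TABLE jobs (
               id INTEGER PRIMARY KEY,
               status TEXT NOT NULL DEFAULT 'ready',
               time_started REAL,
               time_completed REAL,
               attempts INTEGER NOT NULL DEFAULT 0)"""
    )


def claim_next_job(conn, now):
    """Mark one ready job as in progress; return its id, or None."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'ready' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard means only one worker can win this update.
        cur = conn.execute(
            "UPDATE jobs SET status = 'in_progress', time_started = ?, "
            "attempts = attempts + 1 WHERE id = ? AND status = 'ready'",
            (now, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None


def mark_completed(conn, job_id, now):
    with conn:
        conn.execute(
            "UPDATE jobs SET status = 'completed', time_completed = ? "
            "WHERE id = ? AND status = 'in_progress'",
            (now, job_id),
        )


def reset_timed_out_jobs(conn, now):
    """Requeue stuck jobs, or flag them once attempts are exhausted."""
    cutoff = now - TIMEOUT_SECONDS
    with conn:
        conn.execute(
            "UPDATE jobs SET status = 'needs_inspection' "
            "WHERE status = 'in_progress' AND time_started < ? "
            "AND attempts >= ?",
            (cutoff, MAX_ATTEMPTS),
        )
        conn.execute(
            "UPDATE jobs SET status = 'ready', time_started = NULL "
            "WHERE status = 'in_progress' AND time_started < ?",
            (cutoff,),
        )
```

A scanning daemon would call `reset_timed_out_jobs` periodically, while each worker loops on `claim_next_job`, performs the action, and then calls `mark_completed`. Note that a timeout-based reset can run a job twice if the original worker was merely slow, so the timeout should be generous relative to the expected job duration.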


If you're using SQL Server 2005 or later, you may want to investigate Service Broker. It's pretty much designed for this.
