How do you scale an application server that has daemon tasks?

Developer https://www.devze.com 2023-03-08 01:34 Source: web
I have a web application that runs on one server right now. I'd like to switch to a cluster of application servers (jetty), to handle increased load and failover. However, the application has a couple of daemon threads, which run once every 10 minutes in order to process data that has come in. This data must be processed only once (it communicates with external servers, and bad things happen if it's done twice).

What are the best practices for scaling that?

Some of the options I have are:

  1. Have a flag that controls whether an application server runs the daemon tasks, and set it to true on only one server. This works, but it means that I no longer have easy failover: I need to monitor that special application server and take action if it goes down.

  2. Work out some system where the application servers know about each other and have some way of picking a node to run the tasks, e.g. all pick a random number and whichever node is highest gets to run them. Do that every 10 minutes. This has automatic failover (if other nodes can't communicate with one node because it's down, it just gets ignored), but it also means each application server needs to know about every other application server, and I feel like I'm reinventing the wheel here.

How is this situation typically handled?


You can use Quartz to schedule your tasks; it has cluster support.

Besides scheduling your tasks in Quartz, you'll have to create a database (or use an existing one) with Quartz's schema. All servers in the cluster must have their times synchronized (ntpd will do it).

Using Quartz will give you fail-over, load-balancing and the guarantee that each scheduled run of a task is executed on only one node.
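
To make that a bit more concrete, here is a rough sketch of the settings a clustered Quartz setup typically needs, built as a `java.util.Properties` object that could be handed to Quartz's `StdSchedulerFactory`. The class name `ClusteredQuartzConfig`, the scheduler name `DaemonTaskScheduler`, the thread count and the check-in interval are my own placeholders, and the data source itself (driver, URL, credentials) is assumed to be defined elsewhere in your Quartz configuration.

```java
import java.util.Properties;

/** Sketch only: collects the Quartz settings that switch the JDBC job store into clustered mode. */
final class ClusteredQuartzConfig {

    /**
     * @param dataSourceName hypothetical name of a data source that is
     *                       configured elsewhere (an assumption of this sketch)
     */
    static Properties clusterProperties(String dataSourceName) {
        Properties p = new Properties();
        // All nodes share the same scheduler name but need distinct instance ids;
        // AUTO asks Quartz to generate the id for each node.
        p.setProperty("org.quartz.scheduler.instanceName", "DaemonTaskScheduler");
        p.setProperty("org.quartz.scheduler.instanceId", "AUTO");
        // Keep jobs and triggers in the shared database instead of in memory.
        p.setProperty("org.quartz.jobStore.class", "org.quartz.impl.jdbcjobstore.JobStoreTX");
        p.setProperty("org.quartz.jobStore.driverDelegateClass", "org.quartz.impl.jdbcjobstore.StdJDBCDelegate");
        p.setProperty("org.quartz.jobStore.dataSource", dataSourceName);
        // Turn on clustering; nodes check in with the database at this interval (ms).
        p.setProperty("org.quartz.jobStore.isClustered", "true");
        p.setProperty("org.quartz.jobStore.clusterCheckinInterval", "20000");
        // Worker threads that execute the jobs on this node (placeholder size).
        p.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
        p.setProperty("org.quartz.threadPool.threadCount", "2");
        return p;
    }
}
```

Every node would start its scheduler with the same properties against the same database, and the job and trigger are scheduled once rather than once per node.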


Why not use the database to coordinate? Any node that has free cycles could insert an "in progress" row in a jobs table to lock out other nodes. This takes advantage of the fact that you probably already rely on a single database shared by all the nodes, and that it has built-in transaction management.

You would need to devise a simple timing algorithm to ensure that all the nodes didn't wake up at the same time every ten minutes and fight for the lock. Maybe introduce a random delay of 0-10 seconds.
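
A minimal sketch of the idea, assuming plain JDBC and a hypothetical table `job_runs(job_name, slot_start, owner, status)` with a unique constraint on `(job_name, slot_start)`. The class and method names are made up for illustration. Each node works out which ten-minute window it is in, sleeps for a random delay, then tries to insert the row for that window; the unique constraint lets only one insert succeed.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

final class JobSlotLock {
    static final long PERIOD_MILLIS = 10 * 60 * 1000L; // the ten-minute schedule
    static final long MAX_JITTER_MILLIS = 10_000L;     // random delay of 0-10 seconds

    /** Start of the ten-minute window containing nowMillis; the same value on every node. */
    static long slotStart(long nowMillis) {
        return Math.floorDiv(nowMillis, PERIOD_MILLIS) * PERIOD_MILLIS;
    }

    /** Random delay in [0, MAX_JITTER_MILLIS) so the nodes do not all hit the database at once. */
    static long jitterMillis() {
        return ThreadLocalRandom.current().nextLong(MAX_JITTER_MILLIS);
    }

    /**
     * Tries to claim one run of a job by inserting its "in progress" row.
     * Assumes the connection is in auto-commit mode and that the hypothetical
     * job_runs table has a unique constraint on (job_name, slot_start).
     *
     * @return true if this node inserted the row and should run the job
     */
    static boolean tryClaim(Connection db, String jobName, long slot, String nodeId)
            throws SQLException {
        String sql = "INSERT INTO job_runs (job_name, slot_start, owner, status) "
                   + "VALUES (?, ?, ?, 'IN_PROGRESS')";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, jobName);
            ps.setLong(2, slot);
            ps.setString(3, nodeId);
            ps.executeUpdate();
            return true; // our row went in, so this node owns the run
        } catch (SQLException e) {
            String state = e.getSQLState();
            if (state != null && state.startsWith("23")) {
                return false; // integrity-constraint violation: another node got there first
            }
            throw e; // anything else is a real error
        }
    }
}
```

On each wake-up a node would call `Thread.sleep(JobSlotLock.jitterMillis())` and then `tryClaim(...)` with `slotStart(System.currentTimeMillis())`. Note that the window is derived from the local clock, so the nodes' clocks need to be roughly in sync, and the sketch does not deal with a node that dies after claiming a run.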
