When my server gets into high load, a graceful restart of Apache seems to bring things back under control. So I set up monit, with this configuration:
set daemon 10
check system localhost
if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"
So every 10 seconds, I poll the server load, and when it gets above 5, I gracefully restart Apache. However, that temporarily raises the load, so we get into a death spiral. What I want is for it to notice after 10 seconds that the load is 5 or more, and gracefully restart Apache, then wait for 5 minutes or so before checking that particular metric again.
Is there a way to do this with monit?
It's not entirely within monit, but it's close enough:
set daemon 10
check system localhost
if loadavg (1min) > 5 then unmonitor
if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"
# monit expects an absolute path to the program; adjust /usr/bin/python for your system
if loadavg (1min) > 5 then exec "/usr/bin/python /scripts/remonitor.py"
Then you have a python script, like so:
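import subprocess, time
time.sleep(5 * 60)  # give the load 5 minutes to settle
# The service name is the one given in "check system localhost"
subprocess.call(["monit", "monitor", "localhost"])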
So this will:
1. unmonitor the system check when the load gets too high, to prevent the death spiral
2. restart Apache gracefully
3. start the script that re-enables monitoring of the system check after 5 minutes
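A variation on the same idea is to keep the cooldown inside the restart script, so monit never has to be unmonitored at all. The following is only a minimal sketch under stated assumptions, not something from the answer above: the function names, the stamp file, and the script path mentioned afterwards are all hypothetical, and it assumes the script is allowed to write the stamp file.

```python
import subprocess
import time

COOLDOWN_SECONDS = 5 * 60  # the 5-minute wait asked for in the question


def should_restart(now, last_restart, cooldown=COOLDOWN_SECONDS):
    """Return True if no restart is recorded or the cooldown has elapsed."""
    return last_restart is None or now - last_restart >= cooldown


def read_last_restart(stamp_path):
    """Return the timestamp stored in the stamp file, or None if unusable."""
    try:
        with open(stamp_path) as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return None


def maybe_restart(stamp_path, command, now=None):
    """Run command unless a restart happened within the cooldown window.

    Returns the command's exit code, or None if the restart was skipped.
    """
    if now is None:
        now = time.time()
    if not should_restart(now, read_last_restart(stamp_path)):
        return None  # still cooling down; do nothing
    # Record the time first so polls that arrive during the restart are skipped.
    with open(stamp_path, "w") as f:
        f.write(str(now))
    return subprocess.call(command)
```

To use it, a script (say /scripts/graceful_with_cooldown.py, a hypothetical name) would end with a call such as maybe_restart("/var/run/apache_graceful.stamp", ["/etc/init.d/apache2", "graceful"]), and the monit exec line would point at that script instead of the init script. monit keeps polling every 10 seconds, but the script ignores every trigger that arrives within 5 minutes of the last restart.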
What about
set daemon 10
set limits { programtimeout: 300 seconds }
check system localhost
if loadavg (1min) > 5 then exec "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'"
or even
set daemon 10
check system localhost
start program = "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'" with timeout 330 seconds
if loadavg (1min) > 5 then start
I.e., just add the sleep 5m shell command after the command that restarts Apache, and add the appropriate timeout to the monitrc.