I have an application running under apache that I want to keep "in the moment" statistics on. I want to have the application tell me things like:
- requests per second, broken down by types of request
- latency to make requests to various backend services via thrift (broken down by service and server)
- number of errors being served per second
- etc.
I want to do this without any external dependencies. However, I'm running into issues sharing statistics between apache processes. Obviously, I can't just use global memory. What is a good pattern for this sort of issue?
The application is written in Python using Pylons, though I suspect this is more of a "communication across processes" design question than something that's Python-specific.
Perhaps you could keep the relevant counters and other statistics in memcached, accessed by all Apache processes?
I want to do this without any external dependencies.
What if your apache dies somehow? (Separation of concerns?)
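A minimal sketch of the memcached approach. It assumes the `python-memcache` client and a memcached daemon on `localhost:11211`; the `StatsCounter` wrapper and its method names are illustrative, not a real library API. The key point is that memcached's `incr` is atomic on the server, so concurrent Apache processes can bump the same counter safely.

```python
class StatsCounter(object):
    """Keep named counters in memcached so every Apache worker
    process sees (and updates) the same values."""

    def __init__(self, client, prefix="stats."):
        self.client = client          # any object with add/incr/get
        self.prefix = prefix

    def bump(self, name, delta=1):
        key = self.prefix + name
        # add() is a no-op if the key already exists, so this is safe
        # to call concurrently from many processes.
        self.client.add(key, 0)
        # incr is atomic server-side in memcached.
        return self.client.incr(key, delta)

    def value(self, name):
        return self.client.get(self.prefix + name) or 0

# Typical use inside a request handler:
# import memcache
# counters = StatsCounter(memcache.Client(["127.0.0.1:11211"]))
# counters.bump("requests.search")
```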
Personally I am using (redundant) Nagios to monitor the hardware itself, the services, and application metrics. This way I can easily and automatically plot graphs like "requests per second / users online" or "cpu load / user activity X per second", which help with lots of things.
Writing plugins for Nagios is really easy, and there are thousands of premade scripts in just about any language.
Apache monitoring
I am monitoring Apache by extracting the information I need from the Apache mod_status page via a Nagios plugin.
Example plugin response:
APACHE OK - 0.080 sec. response time, Busy/Idle 18/16, open 766/800, ReqPerSec 12.4, BytesPerReq 3074, BytesPerSec 38034
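The kind of check behind a line like that can be sketched as follows. This parses the machine-readable variant of mod_status output (the `?auto` form, with `Field: value` lines such as `BusyWorkers` and `ReqPerSec`); actually fetching `http://localhost/server-status?auto` is left out, and the function names here are illustrative.

```python
def parse_status(body):
    """Turn mod_status ?auto output into a dict of field -> string value."""
    fields = {}
    for line in body.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

def nagios_line(fields):
    """Format a Nagios-style status line from parsed mod_status fields."""
    return "APACHE OK - Busy/Idle %s/%s, ReqPerSec %s, BytesPerSec %s" % (
        fields.get("BusyWorkers", "?"), fields.get("IdleWorkers", "?"),
        fields.get("ReqPerSec", "?"), fields.get("BytesPerSec", "?"))
```

A real plugin would also apply warning/critical thresholds and exit with the matching Nagios status code.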
Application Monitoring
I used mod_status just as an example for your list of things you'd like to monitor.
For our application we have a very small framework for Nagios plugins: every check is a small class that runs its query against a cache or database and returns its value to Nagios via a small, simple command-line script.
More examples:
Memcache:
OK - consumption: 82.88% (106.1 MBytes/128.0 MBytes), connections: 2, requests/s: 10.99, hitrate: 95.2% (34601210/36346999), getrate: 50.1% (36346999/72542987)
Application feature #1 usage:
OK - last 5m: 22 last 24h: 655 ever: 26121
Application feature #2 usage:
OK - last 5m: 39 last 24h: 11011
Other applications metrics:
OK - users online: 556
My point: extending Nagios for application monitoring is very easy. The little framework took me 3-4 hours to write, and adding a new check now takes just a few minutes.
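A rough sketch of what such a plugin mini-framework can look like (the class and method names here are invented for illustration, not the poster's actual code): the base class maps a check result to a Nagios status line and exit code, and each subclass only supplies the measurement.

```python
# Nagios exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

class NagiosCheck(object):
    name = "CHECK"

    def check(self):
        """Return (status_code, message); override in subclasses."""
        raise NotImplementedError

    def run(self):
        labels = {OK: "OK", WARNING: "WARNING",
                  CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}
        try:
            status, message = self.check()
        except Exception as exc:
            status, message = UNKNOWN, str(exc)
        print("%s %s - %s" % (self.name, labels[status], message))
        return status   # a real plugin would sys.exit(status)

class UsersOnlineCheck(NagiosCheck):
    name = "USERS"

    def __init__(self, count_users):
        self.count_users = count_users   # e.g. a cache or database query

    def check(self):
        return OK, "users online: %d" % self.count_users()

# Command-line entry point of the plugin script:
# import sys
# sys.exit(UsersOnlineCheck(query_user_count).run())
```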
Nagios plug-in development guidelines
Use the pylons.g object. It is an instance of the Globals class defined in your Pylons application's lib/app_globals.py file. Its state changes are visible to all threads within a process, so anything you put in it needs to be threadsafe.
lib/app_globals.py:
class Globals(object):
    def __init__(self):
        self.requests_served = 0
controllers/status.py:
from pylons import g

class StatusController(BaseController):
    def status(self):
        g.requests_served += 1
        return "Served %d requests." % g.requests_served
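Since the answer stresses thread safety: a bare `g.requests_served += 1` is a read-modify-write that can drop counts under a threaded MPM. A minimal guarded version is sketched below (the `bump_requests` method is illustrative). Note also that each Apache *process* gets its own Globals instance, so this shares the counter only among threads of one process, not across processes.

```python
import threading

class Globals(object):
    def __init__(self):
        self.requests_served = 0
        self._lock = threading.Lock()

    def bump_requests(self):
        # Serialize the read-modify-write so concurrent threads
        # cannot interleave and lose increments.
        with self._lock:
            self.requests_served += 1
            return self.requests_served
```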