We're running clusters of servers for a dozen customers. Each customer has a few app servers on Jetty. What's important here is:
- There are many Java processes to monitor.
- I need to be able to group (slice) them by machine and by customer (one machine can serve many customers, and each customer has many machines).
- I also need to collect and display some domain specific per-customer data.
I'd prefer to expose this data through JMX, though I'm still free to use other reasonable solutions.
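To make the slicing requirement concrete, here is a minimal sketch of how per-customer data could be exposed as standard MBeans, with the customer and machine encoded as key properties of the `ObjectName`. The domain `com.example.monitoring`, the `CustomerStats` bean and its `OrdersProcessed` attribute are made up for illustration; only the `javax.management` calls are real API.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class CustomerStatsDemo {
    // Hypothetical JMX domain; pick whatever fits your naming scheme.
    static final String DOMAIN = "com.example.monitoring";

    // Standard MBean convention: the management interface must be public
    // and named <ImplementationClass>MBean.
    public interface CustomerStatsMBean {
        long getOrdersProcessed();
    }

    // Hypothetical domain-specific metric: orders processed for one customer.
    public static class CustomerStats implements CustomerStatsMBean {
        private final AtomicLong orders = new AtomicLong();

        public void recordOrder() { orders.incrementAndGet(); }

        @Override
        public long getOrdersProcessed() { return orders.get(); }
    }

    // Registers one bean per (customer, machine) pair.
    static CustomerStats register(MBeanServer server, String customer, String machine) {
        try {
            CustomerStats stats = new CustomerStats();
            server.registerMBean(stats, new ObjectName(
                    DOMAIN + ":type=CustomerStats,customer=" + customer + ",machine=" + machine));
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Finds all beans whose key property (e.g. "customer") has the given value.
    static Set<ObjectName> slice(MBeanServer server, String key, String value) {
        try {
            ObjectName pattern = new ObjectName(
                    DOMAIN + ":type=CustomerStats," + key + "=" + value + ",*");
            return server.queryNames(pattern, null);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        register(server, "acme", "app01").recordOrder();
        register(server, "acme", "app02");
        register(server, "globex", "app01");

        System.out.println("acme beans:  " + slice(server, "customer", "acme").size());
        System.out.println("app01 beans: " + slice(server, "machine", "app01").size());
    }
}
```

In a real deployment each JVM would register only its own beans, and the monitoring server would apply the same `ObjectName` patterns when collecting from every machine.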
What (preferably free & open source) server can I use for monitoring? I need something that will gather information from all those servers, keep history, and let me write my custom dashboard for presentation.
One solution I considered is Hyperic, but it's really unwieldy and the plugin development is horrible.
You might want to try Nagios/Centreon for this kind of thing. You can write plugins for it in Java: either get NRPE to run Java programs that connect to your server, or run jNRPE, which implements the NRPE protocol and thus avoids starting lots of short-lived JVMs. To query JMX beans, use a JMX command-line client (I believe the Internet Archive produces one).
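As a rough illustration of what such a Java check could look like, the sketch below reads `HeapMemoryUsage` from the standard `java.lang:type=Memory` bean and maps it to the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It does not use the jNRPE plugin API; the class name and the 80/90 percent thresholds are assumptions. With no argument it inspects its own JVM, and with a JMX service URL as the first argument it connects remotely.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Locale;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class JmxHeapCheck {
    // Exit codes defined by the Nagios plugin conventions.
    static final int OK = 0, WARNING = 1, CRITICAL = 2, UNKNOWN = 3;
    static final String[] LABELS = {"OK", "WARNING", "CRITICAL", "UNKNOWN"};

    static int classify(double usedPercent, double warnPercent, double critPercent) {
        if (usedPercent >= critPercent) return CRITICAL;
        if (usedPercent >= warnPercent) return WARNING;
        return OK;
    }

    // Reads heap usage through JMX; works for local and remote connections.
    static double heapUsedPercent(MBeanServerConnection conn) throws Exception {
        CompositeData heap = (CompositeData) conn.getAttribute(
                new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
        long used = (Long) heap.get("used");
        long max = (Long) heap.get("max");            // -1 when no maximum is defined
        long committed = (Long) heap.get("committed");
        long limit = max > 0 ? max : committed;
        return 100.0 * used / limit;
    }

    static String statusLine(int status, double usedPercent) {
        return String.format(Locale.ROOT, "HEAP %s - %.1f%% of heap used",
                LABELS[status], usedPercent);
    }

    public static void main(String[] args) {
        int status = UNKNOWN;
        String line;
        JMXConnector connector = null;
        try {
            MBeanServerConnection conn;
            if (args.length > 0) {
                // e.g. service:jmx:rmi:///jndi/rmi://host:port/jmxrmi
                connector = JMXConnectorFactory.connect(new JMXServiceURL(args[0]));
                conn = connector.getMBeanServerConnection();
            } else {
                conn = ManagementFactory.getPlatformMBeanServer();
            }
            double percent = heapUsedPercent(conn);
            status = classify(percent, 80.0, 90.0);   // assumed thresholds
            line = statusLine(status, percent);
        } catch (Exception e) {
            line = "HEAP UNKNOWN - " + e;
        } finally {
            if (connector != null) {
                try { connector.close(); } catch (IOException ignored) { }
            }
        }
        System.out.println(line);
        System.exit(status);
    }
}
```

Nagios only looks at the first line of output and the exit code, so the same skeleton could be reused for any other JMX attribute.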
There is also JConsole of course - I believe that you can write plugins for this, but it may not be up to the job!
Finally jstat produces lots of nice statistics about GC and associated processes, which you can pipe directly to a file. Not very graphical though.
You can look into ManageEngine if you want a JMX-based solution. Alternatively, you can use Graylog2 for a Syslog- or AMQP-based solution.
You can try using BTrace to get the metrics you are after from your Jetty instances. BTrace will log its output to a set of files, which can then later be imported/exported automatically into a monitoring application.
I've written an article about BTrace at InfoQ that might be helpful to you: http://www.infoq.com/articles/java-profiling-with-open-source
The upside of using BTrace is that you can log in any format you'd like, and you can group your metrics to meet your requirements.
The article explains how you can use EurekaJ to group, visualize and alert on the metrics gathered.
You can then group your metrics the way you'd like:
- customername:machine:application:path:to:metric
- machine:app:customer:path:to:metric
- app:machine:customer:path:to:metric
- etc.
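To illustrate why a colon-delimited key makes this kind of slicing cheap, here is a small sketch that sums sample values by one segment of the key. It is not BTrace or EurekaJ code; the class, the `customer:machine:application:path` layout and the sample values are assumptions chosen to match the first layout above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MetricGrouping {
    // Segment positions in the assumed layout customer:machine:application:path.
    static final int CUSTOMER = 0, MACHINE = 1, APPLICATION = 2;

    // Splits a key into customer, machine, application and the remaining path.
    static String[] parse(String key) {
        String[] parts = key.split(":", 4);
        if (parts.length < 4) {
            throw new IllegalArgumentException("Expected customer:machine:application:path, got " + key);
        }
        return parts;
    }

    // Sums all sample values that share the same value in the given segment.
    static Map<String, Long> sumBy(Map<String, Long> samples, int segment) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> sample : samples.entrySet()) {
            totals.merge(parse(sample.getKey())[segment], sample.getValue(), Long::sum);
        }
        return totals;
    }

    // Made-up sample data: request counts per customer, machine and application.
    static Map<String, Long> sample() {
        Map<String, Long> samples = new LinkedHashMap<>();
        samples.put("acme:app01:shop:http:requests", 10L);
        samples.put("acme:app02:shop:http:requests", 5L);
        samples.put("globex:app01:shop:http:requests", 7L);
        return samples;
    }

    public static void main(String[] args) {
        System.out.println(sumBy(sample(), CUSTOMER)); // prints {acme=15, globex=7}
        System.out.println(sumBy(sample(), MACHINE));  // prints {app01=17, app02=5}
    }
}
```

Whichever segment order you pick, keeping it identical across all agents is what lets the monitoring side group by customer or by machine without extra metadata.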
With BTrace you can get almost anything you want out of your JVMs: method execution times, memory usage, threads, etc.
Use Nagios or one of its derivatives/forks (Opsview, Icinga).
You can set up two-way monitoring:
- Get Nagios to check whether your Jetty instances are running (poke them using a URL and ensure you get the expected positive response). There are standard plugins for this sort of keep-alive check.
- Also, get your Java processes and web apps to send alerts (called passive checks in Nagios) to Nagios if something is wrong (you can use my library https://code.google.com/p/jsendnsca/ for this :-)).
Not only will Nagios monitor your Java processes in this way, but also any other services you are running across your estate (HTTP, FTP, SMTP services), as well as general tin health (CPU, memory, load average).
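The standard Nagios HTTP plugin is normally all you need for the keep-alive check, but if you want to see what such a check boils down to, here is a minimal sketch in plain Java. The class name, the default URL `http://localhost:8080/health` and the expected text `OK` are placeholders, not anything Jetty or Nagios defines; the exit codes follow the Nagios plugin conventions (0 OK, 2 CRITICAL).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class JettyAliveCheck {
    static final int OK = 0, CRITICAL = 2;

    // Returns OK only if the URL answers 200 and the body contains the expected text.
    static int check(String url, String expectedText, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return CRITICAL;
            }
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream body = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                for (int n; (n = in.read(buffer)) != -1; ) {
                    body.write(buffer, 0, n);
                }
                String text = new String(body.toByteArray(), StandardCharsets.UTF_8);
                return text.contains(expectedText) ? OK : CRITICAL;
            }
        } catch (IOException | RuntimeException e) {
            // Refused connection, timeout, malformed URL: all count as "down".
            return CRITICAL;
        } finally {
            if (conn != null) conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "http://localhost:8080/health"; // placeholder
        String expected = args.length > 1 ? args[1] : "OK";                      // placeholder
        int status = check(url, expected, 5000);
        System.out.println(status == OK ? "JETTY OK - " + url : "JETTY CRITICAL - " + url);
        System.exit(status);
    }
}
```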