Goal: Store arbitrary performance statistics of stuff that you care about (how many customers are currently logged on, how many widgets are being processed, etc.) in a database so that you can understand how your servers are doing over time.
Assumptions: A database is already available, and you already know how to gather the information you want and are capable of putting it in the database however you like.
Some Ideal Attributes of a Solution
- Causes no noticeable performance hit on the server being monitored
- Has a very high precision of measurement
- Does not store useless or redundant information
- Is easy to query (lends itself to gathering/displaying useful information)
- Lends itself to being graphed easily
- Is accurate
- Is elegant
Primary Questions
1) What is a good design/method/scheme for triggering the storing of statistics?
2) What is a good database design for how to actually store the data?
Example answers...that are sort of vague and lame...
1) I could, once per [fixed time interval], store a row of data with all the performance measurements I care about, one per column, in one big flat table indexed by timestamp and/or server.
2) I could have a daemon monitoring performance stuff I care about, and add a row whenever something changes (instead of at fixed time intervals) to a flat table as in #1.
3) I could trigger as in either #1 or #2, but store information about each aspect of performance that I'm measuring in separate tables, opening up the possibility of adding tons of rows for often-changing items and few rows for seldom-changing items (options #1 and #3 are sketched below, after this list).
Etc.
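Purely to make #1 and #3 concrete, here is a rough sketch in Python (standard sqlite3 module; all table and column names here are hypothetical, not something prescribed above) of a wide flat sample table versus a narrow per-metric change table:

```python
import sqlite3

conn = sqlite3.connect("stats.db")

# Option #1: one wide row per sampling interval, one column per measurement.
conn.execute("""
    CREATE TABLE IF NOT EXISTS samples (
        sampled_at   TEXT NOT NULL,       -- timestamp of the sample
        server       TEXT NOT NULL,       -- which server was measured
        logged_in    INTEGER,             -- e.g. customers currently logged on
        widgets_sec  REAL,                -- e.g. widgets processed per second
        PRIMARY KEY (sampled_at, server)
    )
""")

# Option #3: one narrow change-log table (or one table per metric), so
# frequently-changing metrics generate many rows and rarely-changing ones few.
conn.execute("""
    CREATE TABLE IF NOT EXISTS metric_events (
        recorded_at  TEXT NOT NULL,
        server       TEXT NOT NULL,
        metric       TEXT NOT NULL,       -- e.g. 'logged_in', 'widgets_sec'
        value        REAL NOT NULL
    )
""")
conn.commit()
```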
In the end, I will implement something, even if it's some super-braindead approach I make up myself, but I'm betting there are some really smart people out there willing to share their experiences and bright ideas!
For the record, I agree somewhat with KM: it's hard to provide specific answers on the info given; and as is often the case in this sort of scenario, you'll probably get more value out of thinking things through than out of the end result.
For the storage of the data - the best reporting will be done off a DB that's designed to be reported from - an OLAP-type schema.
How you get the data in there is a different matter - how much data are we talking about, and how do you want to move it across? The reason I ask is that if you're going to insert it in a synchronous manner you'll want it to be fast - an OLTP-styled DB schema.
Strictly speaking, if you're after bleeding-edge performance, you'll probably want a separate DB for each part (capturing data / reporting off it).
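As one way to illustrate keeping the capture path cheap and the reporting shape separate, here's a hedged sketch (Python's sqlite3 for brevity; a real split would likely use two different databases, and every name here is an assumption): an append-only raw table plus a periodic rollup into an hourly reporting table.

```python
import sqlite3

conn = sqlite3.connect("stats.db")
conn.executescript("""
    -- Capture side: narrow, index-light, cheap to INSERT into synchronously.
    CREATE TABLE IF NOT EXISTS raw_metrics (
        recorded_at TEXT NOT NULL,   -- assumed to be an ISO-8601 timestamp
        server      TEXT NOT NULL,
        metric      TEXT NOT NULL,
        value       REAL NOT NULL
    );

    -- Reporting side: pre-aggregated rollup, shaped for the questions you ask.
    CREATE TABLE IF NOT EXISTS hourly_metrics (
        hour    TEXT NOT NULL,
        server  TEXT NOT NULL,
        metric  TEXT NOT NULL,
        avg_val REAL,
        max_val REAL,
        samples INTEGER,
        PRIMARY KEY (hour, server, metric)
    );
""")

# Periodic rollup job: fold the raw capture rows into the reporting table.
conn.execute("""
    INSERT OR REPLACE INTO hourly_metrics
    SELECT strftime('%Y-%m-%d %H:00', recorded_at) AS hour,
           server, metric,
           AVG(value), MAX(value), COUNT(*)
    FROM raw_metrics
    GROUP BY hour, server, metric
""")
conn.commit()
```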
Before you start - if you want elegance - you'll need to carefully consider the logical data model of the data you're wanting to pull in. High on your priority list should be core dimensions: time, origin (component, etc), and so on. This is one of the most important things about BI / data based projects - what questions are you trying to answer?
This is also where you'll start to figure out what you want to capture. Make sure you have good definitions of that data (where it comes from, what it means, etc). By 'where it comes from' I'm referring not just to the method / class / component / system, but also to what actually originates the values you're recording and their meaning; this will be especially important for stuff like the number of users logged in: what exactly does that figure mean? If it's a web app and you record every request, you'll be able to report on the number of users "logged in" any way you want: averages, by time of day, peak concurrency, etc.
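To make the "record every request, derive the figure later" idea concrete, here's a small sketch assuming a hypothetical per-request log table (Python with the standard sqlite3 module; the table and column names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect("stats.db")

# Hypothetical request log: one row per HTTP request.
conn.execute("""
    CREATE TABLE IF NOT EXISTS request_log (
        requested_at TEXT NOT NULL,   -- ISO timestamp of the request
        user_id      INTEGER NOT NULL
    )
""")

# Distinct active users per hour, derived after the fact from raw requests.
users_by_hour = conn.execute("""
    SELECT strftime('%Y-%m-%d %H:00', requested_at) AS hour,
           COUNT(DISTINCT user_id)                  AS active_users
    FROM request_log
    GROUP BY hour
    ORDER BY hour
""").fetchall()

# Peak usage (approximated here as the busiest hour) falls out of the same data.
peak = max(users_by_hour, key=lambda row: row[1], default=None)
```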
One final point - depending on what you're doing (and how), the risk of performance loss due to capturing too much data is low; it's often better to have it and not need it than to need it and not have it. Just because you have it doesn't mean you have to report on it.
Accuracy: use an existing, widely used industry component for capturing / logging data.
The MS Ent Libs are great for this; they have a large user base, and so their quality is high. They include a Trace statement for recording execution time down to a fine level.
They are also highly configurable - which helps contribute towards an elegant solution.
I think "Does not store useless or redundant information" and "Is easy to query (lends itself to gathering/displaying useful information)" are mutually exclusive. If you store data very efficiently, like in your example #2, you would need complex queries to recreate what was happening during a range or point in time, since you only store changes.
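For instance, here is a minimal sketch (Python with the standard sqlite3 module, reusing the hypothetical change-log table `metric_events` from the earlier sketch) of the kind of "latest change at or before time T" query a change-only design forces on you just to reconstruct a point-in-time snapshot:

```python
import sqlite3

conn = sqlite3.connect("stats.db")

# Reconstruct what each metric's value was at a given instant when only
# changes are stored: take the most recent change at or before that time.
point_in_time = "2011-06-01 12:00:00"
state = conn.execute("""
    SELECT m.server, m.metric, m.value
    FROM metric_events AS m
    WHERE m.recorded_at = (
        SELECT MAX(recorded_at)
        FROM metric_events
        WHERE server = m.server
          AND metric = m.metric
          AND recorded_at <= ?
    )
""", (point_in_time,)).fetchall()
```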
You really provide no details, so specific recommendations are difficult. For example, in your example #1, how many intervals per minute would you consider? This could affect your "Causes no noticeable performance hit on the server being monitored" requirement; or maybe not, depending on whether it's 1 per hour or 30 per minute.
You provide no information on the kinds of stats you are gathering, so table design is impossible.
Whatever you do, have the stats going to a database on a different server. This will give you the least performance impact on the production database.
Before you start, remember to have generic monitoring tools in place. Zabbix or Munin can tell you where the general resource bottlenecks are (e.g. memory, CPU, I/O), database-specific log-analysis tools like PgBadger can tell you what queries are slow, etc.
I will answer indirectly, by available technology and time to implement:
1) If you have many different metrics that you want to start logging as soon as possible, and can worry later about retrieving the data (for example, you don't know exactly what you want to measure, so you just want to dump everything you can), you could use some NoSQL database like MongoDB or Redis.
Later, you can play with the data. Not very efficient for retrieval.
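As a rough sketch of that "dump everything now, decide later" style, assuming a local MongoDB instance and the pymongo driver (the database, collection, and field names here are made up):

```python
from datetime import datetime, timezone

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
samples = client["perfstats"]["samples"]   # hypothetical database/collection names

# Schema-less insert: record whatever you can measure right now and sort out
# the reporting model later.
samples.insert_one({
    "recorded_at": datetime.now(timezone.utc),
    "server": "web01",
    "logged_in": 42,
    "widgets_per_sec": 3.7,
})
```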
2) If you like the agile approach, start small. Choose any backend (SQL, text files) and dump only some data, then extend it with more columns depending on what the first investigation shows you. Table design depends only on your app. For an MVC web application, it could be controller / action / user_id / total_pageload_time / memory_used. Then you can simply group by controller and see which parts are slow. In my experience, SQLite works well for write-only data like this.
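A minimal sketch of that kind of starting point, using the columns suggested above with Python's standard sqlite3 module (everything beyond those column names is an assumption):

```python
import sqlite3

conn = sqlite3.connect("perf.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS page_stats (
        logged_at           TEXT DEFAULT CURRENT_TIMESTAMP,
        controller          TEXT,
        action              TEXT,
        user_id             INTEGER,
        total_pageload_time REAL,    -- seconds
        memory_used         INTEGER  -- bytes
    )
""")

# Log one request (e.g. from the app's after-request / teardown hook).
conn.execute(
    "INSERT INTO page_stats (controller, action, user_id, total_pageload_time, memory_used) "
    "VALUES (?, ?, ?, ?, ?)",
    ("orders", "index", 123, 0.84, 18500000),
)
conn.commit()

# "Which parts are slow?" -- simply group by controller as described above.
slowest = conn.execute("""
    SELECT controller, AVG(total_pageload_time) AS avg_load, COUNT(*) AS hits
    FROM page_stats
    GROUP BY controller
    ORDER BY avg_load DESC
""").fetchall()
```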
3) If you need to micro-optimize, you should probably look for official and unofficial plugins or external tools for your server software (e.g. PgBadger for PostgreSQL, etc).
Anyway, considering your "ideal attributes", I'd go with some kind of SQL server with small tables at first, adding tracked parameters incrementally instead of trying to come up with one perfect, forever-lasting table design.