My requirement is probably close to what one expects of an "Expert System". I am looking for the simplest solution that can give me real-time or near-real-time inference, with some offline (non-real-time) learning capabilities.
To elaborate, my problem is --
Watch a log that is being updated live, and classify each entry as Red, Green, or Blue. The classification is based on logic codified as production rules (as I imagine it today).
The point where it gets challenging is --
1) Log entries tagged Blue will eventually have to be re-tagged Red or Green based on subsequent log entries, where we hope to have more detailed information, so there is a bit of remembering to be done. The exact duration to wait isn't known in advance, but there is a maximum limit. Of course, at any given point in time, there could be several hundred thousand entries tagged Blue.
2) The rules that determine Red and Green are not perfect, so entries are sometimes mislabeled, and an occasional manual audit reveals these mistakes. My main challenge is to see whether I can automate some part of the rule updating with minimal programming effort.
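To make the "remembering" in point 1 concrete, here is a minimal C++ sketch of a store for entries that are still Blue. Everything in it is hypothetical: the `PendingStore` name, the string correlation key, and timestamps in whole seconds are assumptions for illustration, and it assumes timestamps arrive in non-decreasing order, as they would from a live log.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical store for entries that are still tagged Blue.
// Assumes add() is called with non-decreasing timestamps (seconds).
class PendingStore {
public:
    explicit PendingStore(std::int64_t max_wait_secs) : max_wait_(max_wait_secs) {}

    // Remember a Blue entry, identified by `key`, first seen at time `now`.
    void add(const std::string& key, std::int64_t now) {
        pending_[key] = now;
        order_.emplace_back(now, key);
    }

    // A later log entry settled the label (Red or Green).
    // Returns false if the key was not pending.
    bool resolve(const std::string& key) { return pending_.erase(key) > 0; }

    // Remove and return every key that has waited longer than the maximum.
    // The caller decides which default label those entries get.
    std::vector<std::string> expire(std::int64_t now) {
        std::vector<std::string> expired;
        while (!order_.empty() && now - order_.front().first > max_wait_) {
            const std::pair<std::int64_t, std::string> oldest = order_.front();
            order_.pop_front();
            auto it = pending_.find(oldest.second);
            // Skip keys that were already resolved or re-added later.
            if (it != pending_.end() && it->second == oldest.first) {
                pending_.erase(it);
                expired.push_back(oldest.second);
            }
        }
        return expired;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::int64_t max_wait_;
    std::unordered_map<std::string, std::int64_t> pending_;    // key -> time added
    std::deque<std::pair<std::int64_t, std::string>> order_;   // oldest first
};
```

With a hash map plus a time-ordered queue, resolving an entry is a single lookup and expiry only touches the oldest entries, so a few hundred thousand pending Blue entries should be manageable in memory.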
My (continuing) study suggests that a rule engine based on the RETE algorithm might serve my classification and labeling, including the re-labeling. If that works, I still need to figure out how to automate the "learning from mistakes" part. Can one take a statistical approach, such as Bayesian classification? Also, could one use Bayesian classification in place of the rules engine entirely for the initial classification, provided I've manually trained the system sufficiently? The Bayesian approach seems to "dumb down" the task of maintaining a correct set of rules into a "trust the statistics" approach, especially as there are these periodic manual audits.
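For reference, the kind of Bayesian classifier I have in mind could look roughly like the sketch below: a multinomial naive Bayes over tokens from a log entry, with add-one smoothing, where each audited entry is fed back through `train`. The class name, the string labels, and the idea of tokenizing entries into words are my own assumptions, not something taken from an existing library.

```cpp
#include <cmath>
#include <limits>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing.
class NaiveBayes {
public:
    // Record one audited example: its tokens and the confirmed label.
    void train(const std::vector<std::string>& tokens, const std::string& label) {
        ClassStats& st = classes_[label];
        ++st.docs;
        ++total_docs_;
        for (const std::string& t : tokens) {
            ++st.counts[t];
            ++st.tokens;
            vocab_.insert(t);
        }
    }

    // Return the label with the highest log-posterior, or "" if untrained.
    std::string classify(const std::vector<std::string>& tokens) const {
        std::string best;
        double best_score = -std::numeric_limits<double>::infinity();
        // Vocabulary size plus one slot reserved for unseen tokens.
        const double v = static_cast<double>(vocab_.size()) + 1.0;
        for (const auto& kv : classes_) {
            const ClassStats& st = kv.second;
            double score = std::log(static_cast<double>(st.docs) / total_docs_);
            for (const std::string& t : tokens) {
                auto it = st.counts.find(t);
                const double c = (it == st.counts.end()) ? 0.0 : it->second;
                score += std::log((c + 1.0) / (st.tokens + v));
            }
            if (score > best_score) {
                best_score = score;
                best = kv.first;
            }
        }
        return best;
    }

private:
    struct ClassStats {
        int docs = 0;                        // examples seen with this label
        int tokens = 0;                      // total tokens seen with this label
        std::map<std::string, int> counts;   // per-token counts
    };
    std::map<std::string, ClassStats> classes_;
    std::set<std::string> vocab_;
    int total_docs_ = 0;
};
```

The appeal is that a manual audit then becomes just another call to `train` with the corrected label, rather than an edit to a rule.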
PS> My main application is written in C++ (if that matters).
This sounds like Complex Event Processing (CEP), where you have rules and the ability to use time calculations such as "event X occurs within 2 minutes after event Y".
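Stripped of any engine, that kind of temporal constraint is just a comparison of timestamps. A minimal C++ sketch, where the `within_after` name and the choice of `std::chrono::system_clock` are assumptions for illustration (a CEP engine adds the rule matching and event bookkeeping on top of this):

```cpp
#include <chrono>

using Clock = std::chrono::system_clock;

// True when x happened strictly after y and no more than `window` later.
inline bool within_after(Clock::time_point x, Clock::time_point y,
                         std::chrono::seconds window) {
    return x > y && (x - y) <= window;
}
```

For example, `within_after(x_time, y_time, std::chrono::minutes(2))` expresses the "within 2 minutes after" condition.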
In Java land, Drools Fusion (or Drools Expert) would handle that really well (I am biased though). In C++ land... well, maybe you can set up a drools-camel-server and communicate with it through XML.