Event correlation and filtering: how to do it, and where to start?

开发者 (Developer) https://www.devze.com 2023-02-21 07:59 Source: web
I have an asynchronous stream of events, where each event carries information such as:

  • Agency (one of many Agencies possible to be served by my solution)
  • Agent (one of many Agents in an Agency)
  • Served-Entity (a person/organization served by 1 or more agencies)
  • Date+Time
  • Class-Data (tags from a fixed but large set of tags)
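The fields above could be modelled as a small immutable record. This is only a sketch under stated assumptions: the names `Event`, `make_event` and `TIMESTAMP_FORMAT` are hypothetical, the timestamp is assumed to be day-month-year (the example dates further down do not settle that), and Class-Data is assumed to arrive as a comma-separated string. The sample values are taken from Event #0021 in the example further down.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: timestamps look like '12-03-2011/11:03:37' and are day-month-year.
TIMESTAMP_FORMAT = "%d-%m-%Y/%H:%M:%S"


@dataclass(frozen=True)
class Event:
    event_id: str
    agency: str
    agent: str
    served_entity: str
    timestamp: datetime
    tags: frozenset  # Class-Data tags; a set, so unseen tags need no schema change


def make_event(event_id, agency, agent, served_entity, timestamp_text, class_data):
    """Build an Event from raw text fields (comma-separated Class-Data)."""
    tags = frozenset(t.strip() for t in class_data.split(",") if t.strip())
    when = datetime.strptime(timestamp_text, TIMESTAMP_FORMAT)
    return Event(event_id, agency, agent, served_entity, when, tags)


example = make_event("0021", "XYZ", "ABC", "MMN", "12-03-2011/11:03:37",
                     "missed-delivery,no-repeat,untracable,orphan")
print(example.served_entity, example.timestamp.isoformat(), sorted(example.tags))
```

Holding the tags as a set rather than as fixed columns keeps the record usable if the tag vocabulary grows later.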

What I need to do:

  1. Correlate events based on Served-Entity, Date+Time and Class-Data, and create a consolidated new Event. Example:

    Event #0021: { Agency='XYZ', Agent='ABC', Served-Entity='MMN', Date+Time='12-03-2011/11:03:37', Class-Data='missed-delivery,no-repeat,untracable,orphan' }

    Event #0193: { Agency='KLM', Agent='DAY', Served-Entity='MMN', Date+Time='12-03-2011/12:32:21', Class-Data='missed-delivery,orphan,lost' }

    Event #1217: { Agency='KLM', Agent='CARE', Served-Entity='MMN', Date+Time='12-03-2011/18:50:45', Class-Data='escalated' }

    Here there are 3 events that are spread out in time (more than 7 hours between the first and the last), are for the same Served-Entity (MMN), occur within a certain time window (say, 24 hours), and have matching or related Class-Data.

  2. Finally, create a consolidated (new) event that represents the inference drawn from the correlated events.

  3. Be able to create reports on a per-Agency, per-Agent and per-Served-Entity basis, based on things like specific Class-Data tags (e.g. missed-delivery) over a certain period of time. This could be done using the original/input events, or the synthesized (inference) events.

  4. This is not a requirement today, but it is quite likely to appear in the future: the set of "tags" that appear in Class-Data could grow without any human intervention. So I am not sure whether this should then be treated as unstructured data.

  5. Also not an immediate requirement, but in the future there may be a need to identify trends/patterns of event occurrences (i.e. Event1 led to Event2, which led to Event3).
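Steps 1 and 2 can be sketched in plain Python before committing to any engine. Everything here is an assumption made for illustration: the 24-hour `WINDOW`, the `RELATED_TAGS` table that decides which tags count as "related" (the list above does not define that), the day-month-year date format, and the shape of the consolidated event. It works on a batch of events; a CEP engine would do the equivalent incrementally on the live stream.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # assumption: the "say, 24 hours" window
# Hypothetical relatedness table: tag -> tags treated as related to it.
RELATED_TAGS = {"escalated": {"missed-delivery", "lost"}}


@dataclass(frozen=True)
class Event:
    event_id: str
    agency: str
    agent: str
    served_entity: str
    timestamp: datetime
    tags: frozenset


def tags_match(a, b):
    """True if two tag sets overlap directly or via RELATED_TAGS."""
    if a & b:
        return True
    return (any(RELATED_TAGS.get(t, set()) & b for t in a)
            or any(RELATED_TAGS.get(t, set()) & a for t in b))


def summarize(cluster):
    """Build the consolidated (synthesized) event for one cluster."""
    return {
        "served_entity": cluster[0].served_entity,
        "start": cluster[0].timestamp,
        "end": cluster[-1].timestamp,
        "source_ids": [e.event_id for e in cluster],
        "tags": frozenset().union(*(e.tags for e in cluster)),
    }


def correlate(events):
    """Cluster events per Served-Entity; emit one summary per multi-event cluster."""
    by_entity = defaultdict(list)
    for ev in events:
        by_entity[ev.served_entity].append(ev)

    consolidated = []
    for evs in by_entity.values():
        evs.sort(key=lambda e: e.timestamp)
        cluster = []
        for ev in evs:
            if cluster:
                in_window = ev.timestamp - cluster[0].timestamp <= WINDOW
                seen_tags = frozenset().union(*(e.tags for e in cluster))
                if in_window and tags_match(ev.tags, seen_tags):
                    cluster.append(ev)
                    continue
                if len(cluster) > 1:  # close the previous cluster
                    consolidated.append(summarize(cluster))
            cluster = [ev]  # start a new cluster with this event
        if len(cluster) > 1:
            consolidated.append(summarize(cluster))
    return consolidated


def ev(event_id, agency, agent, entity, ts, tags):
    """Helper for the demo: parse the text form used in the example above."""
    when = datetime.strptime(ts, "%d-%m-%Y/%H:%M:%S")  # assumed day-month-year
    return Event(event_id, agency, agent, entity, when, frozenset(tags.split(",")))


sample = [
    ev("0021", "XYZ", "ABC", "MMN", "12-03-2011/11:03:37",
       "missed-delivery,no-repeat,untracable,orphan"),
    ev("0193", "KLM", "DAY", "MMN", "12-03-2011/12:32:21",
       "missed-delivery,orphan,lost"),
    ev("1217", "KLM", "CARE", "MMN", "12-03-2011/18:50:45", "escalated"),
]
result = correlate(sample)
print(result[0]["source_ids"], sorted(result[0]["tags"]))
```

One simplification to be aware of: an unrelated event for the same Served-Entity closes the current cluster, so interleaved, unrelated incidents would need per-incident tracking instead of a single running cluster.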

The event arrival rate could be quite high: possibly thousands of events per minute, maybe more. I also need to archive the original and synthesized events for a period of time (a month or so).
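For the archive and the tag-based reports, one FOSS-only starting point is a relational store with tags kept in a separate table. The sketch below uses SQLite from the Python standard library purely as an illustration; the schema, the function names and the 30-day `RETENTION` are all assumptions, and whether SQLite copes with the real event rate would have to be measured.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # assumption: "a month or so"


def open_archive(path=":memory:"):
    """Open (or create) the archive; tags live in their own table."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS events (
            event_id TEXT PRIMARY KEY, agency TEXT, agent TEXT,
            served_entity TEXT, ts TEXT);
        CREATE TABLE IF NOT EXISTS event_tags (event_id TEXT, tag TEXT);
        CREATE INDEX IF NOT EXISTS idx_tag ON event_tags (tag);
        CREATE INDEX IF NOT EXISTS idx_ts ON events (ts);
    """)
    return db


def archive(db, rows):
    """rows: iterable of (event_id, agency, agent, served_entity, datetime, tags)."""
    with db:  # one transaction per batch keeps inserts cheap
        for event_id, agency, agent, entity, ts, tags in rows:
            db.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?)",
                       (event_id, agency, agent, entity,
                        ts.isoformat(timespec="seconds")))
            db.executemany("INSERT INTO event_tags VALUES (?, ?)",
                           [(event_id, t) for t in tags])


def tag_report(db, tag, start, end, group_by="agency"):
    """Count events carrying `tag` in [start, end), grouped by one column."""
    if group_by not in {"agency", "agent", "served_entity"}:
        raise ValueError(group_by)  # column name is interpolated, so whitelist it
    sql = (f"SELECT e.{group_by}, COUNT(*) FROM events e "
           "JOIN event_tags t ON t.event_id = e.event_id "
           "WHERE t.tag = ? AND e.ts >= ? AND e.ts < ? "
           f"GROUP BY e.{group_by} ORDER BY e.{group_by}")
    args = (tag, start.isoformat(timespec="seconds"),
            end.isoformat(timespec="seconds"))
    return db.execute(sql, args).fetchall()


def purge_expired(db, now):
    """Delete events (and their tags) older than RETENTION."""
    cutoff = (now - RETENTION).isoformat(timespec="seconds")
    with db:
        db.execute("DELETE FROM event_tags WHERE event_id IN "
                   "(SELECT event_id FROM events WHERE ts < ?)", (cutoff,))
        db.execute("DELETE FROM events WHERE ts < ?", (cutoff,))


db = open_archive()
archive(db, [
    ("0021", "XYZ", "ABC", "MMN", datetime(2011, 3, 12, 11, 3, 37),
     ["missed-delivery", "no-repeat", "untracable", "orphan"]),
    ("0193", "KLM", "DAY", "MMN", datetime(2011, 3, 12, 12, 32, 21),
     ["missed-delivery", "orphan", "lost"]),
    ("1217", "KLM", "CARE", "MMN", datetime(2011, 3, 12, 18, 50, 45),
     ["escalated"]),
])
print(tag_report(db, "missed-delivery",
                 datetime(2011, 3, 12), datetime(2011, 3, 13)))
```

Because tags are rows rather than columns, a tag that has never been seen before needs no schema change, which also bears on the question in item 4. Synthesized events could be archived in the same tables if they are given their own IDs.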

My solution needs to be based on FOSS components (preferably). My research so far points in the direction of CEP (Complex Event Processing), Bayesian networks/classification, and predictive analytics.

I'm looking for suggestions on which approach to take. I'd prefer the path that meets most of my goals with minimum difficulty/time; to put it another way, "learning AI" or "formal statistical methods" isn't my short-term goal :-)


Mike,

Did you consider something like Esper/NEsper to see if it might meet your requirements? I've looked at something similar myself, especially on Erlang (check my post here), and you'd find some useful answers there.

IC


Your problem is a tactical problem as opposed to a procedural problem. Both types have their own set of tools, and you will be in a world of pain if you try to solve a tactical problem with procedural tools.

Just to clarify terms: when I say procedural, I am talking about use cases where you can say do X, then Y, then Z. With tactical problems, X, Y, and Z can occur at any time and in any order, and you must be able to handle each event whenever it arrives.

You are on the right track with CEP. You might also look into using a rules engine. You didn't mention what your dev environment is, but if it's Java, you might take a look at Jess. If you really want a nice and robust rules engine, take a look at TIBCO BusinessEvents. It is very powerful and fault tolerant, but definitely not free.
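To give a feel for the rules-engine style without tying it to any product, here is a toy sketch in Python. It is not Jess, Esper or BusinessEvents syntax; `Rule`, `TinyEngine` and the sample rule are hypothetical names made up for illustration, and all facts are assumed to belong to one Served-Entity. The point is only that a rule states a condition over whatever facts have arrived, so it fires regardless of arrival order.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[List[dict]], bool]  # inspects all facts seen so far
    action: Callable[[List[dict]], dict]     # builds a derived (inferred) fact
    fired: bool = False


@dataclass
class TinyEngine:
    rules: List[Rule]
    facts: List[dict] = field(default_factory=list)

    def insert(self, fact):
        """Add a fact, then fire each not-yet-fired rule whose condition holds."""
        self.facts.append(fact)
        derived = []
        for rule in self.rules:
            if not rule.fired and rule.condition(self.facts):
                rule.fired = True
                derived.append(rule.action(self.facts))
        return derived


def miss_and_escalation(facts):
    """Condition: both tags have been seen, in whatever order they arrived."""
    seen = set().union(*(f["tags"] for f in facts))
    return {"missed-delivery", "escalated"} <= seen


def infer_unresolved(facts):
    """Action: emit a consolidated fact that points back at its sources."""
    return {"inference": "escalated-missed-delivery",
            "sources": [f["id"] for f in facts]}


engine = TinyEngine([Rule("miss-then-escalate", miss_and_escalation, infer_unresolved)])
# The escalation arrives first here; the rule still fires once both are present.
print(engine.insert({"id": "1217", "tags": {"escalated"}}))
print(engine.insert({"id": "0021", "tags": {"missed-delivery", "orphan"}}))
```

A real engine adds what this toy leaves out, such as efficient matching over many rules, time windows, and retracting facts, which is the main reason to adopt one instead of growing this by hand.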
