Java Support for PMML

Developer https://www.devze.com 2023-04-01 18:49 Source: Internet
I am new to PMML, the Predictive Model Markup Language (www.dmg.org), and I was wondering whether there is any Java support (open source or professional) for creating and parsing PMML files.

Initially I only have in mind creating and parsing PMML files programmatically from Java environments.

I have been "googling" and I have found several possibilities:

Open source:

  • jpmml. (PMML 3.2).

From Java.

  • JDM (javax.datamining). It seems to be dead. Does anyone have more info?

Professional.

  • Zementis (http://www.zementis.com/pmml_tools.htm).

DIY

  • Use an XML Java library and build yourself a parser/writer of PMML files
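As a rough illustration of the DIY route, here is a minimal sketch that uses only the JDK's built-in DOM API. The class and method names (PmmlDomSketch, writeDataDictionary, readFieldNames) are made up for this example. The output is a bare skeleton rather than schema-valid PMML: it omits the Header element and the namespace declaration that a real file needs.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: writes and reads a bare-bones PMML DataDictionary with the JDK DOM API. */
final class PmmlDomSketch {

    /** Builds a PMML skeleton that contains only a DataDictionary of continuous double fields. */
    static String writeDataDictionary(List<String> fieldNames) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element pmml = doc.createElement("PMML");
            pmml.setAttribute("version", "3.2"); // version mentioned in the question; adjust as needed
            doc.appendChild(pmml);

            Element dictionary = doc.createElement("DataDictionary");
            dictionary.setAttribute("numberOfFields", String.valueOf(fieldNames.size()));
            pmml.appendChild(dictionary);

            for (String name : fieldNames) {
                Element field = doc.createElement("DataField");
                field.setAttribute("name", name);
                field.setAttribute("optype", "continuous");
                field.setAttribute("dataType", "double");
                dictionary.appendChild(field);
            }

            // Serialize the DOM tree to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not write PMML", e);
        }
    }

    /** Returns the names of all DataField elements, in document order. */
    static List<String> readFieldNames(String pmmlXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pmmlXml)));
            NodeList fields = doc.getElementsByTagName("DataField");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                names.add(((Element) fields.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse PMML", e);
        }
    }

    public static void main(String[] args) {
        String xml = writeDataDictionary(List.of("age", "income"));
        System.out.println(xml);
        System.out.println(readFieldNames(xml));
    }
}
```

The field names "age" and "income" are placeholders. A full parser/writer would have to cover every element of the PMML schema for the model types in use, which is where most of the effort goes.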

I appreciate all your opinions.

Thanks in advance

Oscar


You should realize that the answer may depend on the MODEL-ELEMENT that you want to work with. It is also very likely that your best options for creating PMML and for parsing PMML will come from different software packages. I am going to assume that by 'creation of PMML' you mean creating the document, not fitting the model. I've never heard of anyone integrating automatic model fitting with execution, but perhaps it exists already. Certainly a PMML model could be passed using SOAP.

I can't speak to the other projects but the product offered by Zementis, called Adapa, is used only for the execution of PMML. This product assumes that there is a model fitting application that will do the creating by exporting a fitted model into PMML. There are already a lot of well developed model fitting applications so I think this is a reasonable assumption.

The version I have used (3.6) was generally fast but it couldn't handle ensembles of typical random forest size (500+ trees) without an especially large heap. I think they may have fixed this in newer versions. Though it isn't advertised, Zementis doesn't appear to offer a few of the models, namely Text Models, Sequences, Baseline Models, or Time Series (for which the PMML standard currently only has Exponential Smoothing anyway). My version also doesn't have K-Nearest Neighbors but I hear that more recent versions do.

Unless you are considering integrated fitting and execution (in which case you should consider online learning), my advice would be to consider these questions in order:

  1. What is the model type that I am interested in using?
  2. What application/s do I prefer to build models in?
  3. Then lastly how will I execute this and what requirements do I have in this regard (web-services, cloud, performance etc)?

If you look at the list of members of the DMG you will find many commercial vendors that are either on the supply side (e.g. SAS, SPSS, Togaware, Rapid-I) or the demand side (too many to list).

Your list also doesn't mention Weka, which can execute some PMML models. There are R/Java-based solutions as well, so you could import PMML into R (see fileToXMLNode) and run it from a Java environment, although you could also just execute R directly.

Finally, if you have a very specific model in mind and you understand what it means mathematically to 'execute it' then it shouldn't be too difficult to build what you need yourself.
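To give a sense of scale, here is a minimal sketch of "executing" one very specific model: a linear RegressionModel with a single RegressionTable and only NumericPredictor terms. It uses nothing beyond the JDK's XML parser. The class name LinearPmmlScorer, the sample document, and the input names are all made up for illustration. Anything else (categorical predictors, normalization, missing-value handling, other model elements) is deliberately ignored.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical scorer for a PMML RegressionTable that has only NumericPredictor terms. */
final class LinearPmmlScorer {

    // Illustrative model: y = 1.5 + 2.0 * x1 - 0.5 * x2^2
    static final String SAMPLE_PMML =
            "<PMML version=\"3.2\">"
          + "<RegressionModel functionName=\"regression\">"
          + "<RegressionTable intercept=\"1.5\">"
          + "<NumericPredictor name=\"x1\" coefficient=\"2.0\"/>"
          + "<NumericPredictor name=\"x2\" exponent=\"2\" coefficient=\"-0.5\"/>"
          + "</RegressionTable>"
          + "</RegressionModel>"
          + "</PMML>";

    /** Computes intercept + sum(coefficient * input^exponent) over the first RegressionTable. */
    static double score(String pmmlXml, Map<String, Double> inputs) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pmmlXml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse PMML", e);
        }

        Element table = (Element) doc.getElementsByTagName("RegressionTable").item(0);
        if (table == null) {
            throw new IllegalArgumentException("no RegressionTable found");
        }

        double result = Double.parseDouble(table.getAttribute("intercept"));
        NodeList predictors = table.getElementsByTagName("NumericPredictor");
        for (int i = 0; i < predictors.getLength(); i++) {
            Element predictor = (Element) predictors.item(i);
            String name = predictor.getAttribute("name");
            Double value = inputs.get(name);
            if (value == null) {
                throw new IllegalArgumentException("missing input: " + name);
            }
            double coefficient = Double.parseDouble(predictor.getAttribute("coefficient"));
            // The exponent attribute is optional; treat an absent one as 1.
            String exponentAttr = predictor.getAttribute("exponent");
            double exponent = exponentAttr.isEmpty() ? 1.0 : Double.parseDouble(exponentAttr);
            result += coefficient * Math.pow(value, exponent);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(score(SAMPLE_PMML, Map.of("x1", 3.0, "x2", 4.0)));
    }
}
```

With x1 = 3 and x2 = 4 the sample model evaluates to 1.5 + 6 - 8 = -0.5. Tree ensembles or neural networks need considerably more code, but the pattern is the same: read the parameters out of the XML, then apply the model's formula.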
