I want to know which algorithms I can experiment with to predict the probability of cancer, fever, or some other condition in a patient from a set of inputs. Please assume that I have millions of records, so I want to try the best algorithms for this kind of prediction. I am really new to data mining and machine learning.
One of the most popular current algorithms for prediction and classification is Random Forests (RF) by Leo Breiman. An implementation is available in Weka, too.
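As a rough illustration of how a forest of trees can yield a probability rather than just a label, here is a minimal sketch in pure Python. It is not Breiman's full algorithm and not Weka's implementation: each tree is restricted to a single split (a "stump"), and the synthetic data, function names, and parameter values are all assumptions made up for the example. The two random-forest ingredients it keeps are bootstrap sampling of the rows and a random subset of features per tree.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(X, y, features):
    """Find the single split (feature, threshold) with the lowest weighted Gini.

    Returns (feature, threshold, p_left, p_right), where p_left and p_right
    are the fractions of positive labels on each side of the split.
    """
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t,
                        sum(left) / len(left), sum(right) / len(right))
    if best is None:
        # No usable split: fall back to the overall positive rate.
        p = sum(y) / len(y)
        return (features[0], float("inf"), p, p)
    return best[1:]


def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), n_features)    # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_proba(forest, x):
    """Average the per-tree leaf probabilities for one input vector."""
    total = sum(pl if x[f] <= t else pr for f, t, pl, pr in forest)
    return total / len(forest)


# Hypothetical data: feature 0 is informative, feature 1 is pure noise.
rng = random.Random(1)
X, y = [], []
for _ in range(200):
    sick = rng.random() < 0.5
    X.append([rng.gauss(1.5 if sick else -1.5, 1.0), rng.gauss(0.0, 1.0)])
    y.append(1 if sick else 0)

forest = fit_forest(X, y)
p_high = predict_proba(forest, [2.0, 0.0])
p_low = predict_proba(forest, [-2.0, 0.0])
print(p_high, p_low)
```

The averaged value is only an estimate of the probability. With millions of rows you would use a real implementation (for example the one in Weka) rather than this exhaustive threshold search, which is quadratic in the number of rows per tree.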
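If you are looking specifically at estimating the probability of something, then you need a machine learning approach that produces probabilities. Many classifiers are normally used to output only a class label (yes/no), so check that the method you pick can also report a probability estimate.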
The most well-known algorithm for estimating probabilities is Logistic Regression. An implementation is available in Weka.
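To make the point about probabilities concrete, here is a minimal sketch of logistic regression fitted by batch gradient descent, written in pure Python so it runs without any library. The synthetic data, the function names, and the learning rate and epoch count are assumptions chosen for illustration. This is not how Weka implements it, and a real dataset with millions of rows would call for an optimized library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical data: feature 0 is informative, feature 1 is pure noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    sick = rng.random() < 0.5
    X.append([rng.gauss(1.5 if sick else -1.5, 1.0), rng.gauss(0.0, 1.0)])
    y.append(1 if sick else 0)

w, b = fit_logistic(X, y)
p_high = predict_proba(w, b, [2.0, 0.0])
p_low = predict_proba(w, b, [-2.0, 0.0])
print(p_high, p_low)
```

The output of `predict_proba` is a number between 0 and 1 rather than a yes/no label, which is what the question asks for. A label can still be derived from it by thresholding, for example at 0.5.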
The question being a bit vague, I can only give a vague answer: use the almighty SVM! Feed the SVM classifier with your millions of input vectors, and it should be able to give you state-of-the-art predictions afterwards.
If you're looking for an implementation of SVM, have a look at libsvm, which has wrappers in almost every decent programming language.
The most popular tool for starting to experiment with a large number of machine learning algorithms is Weka. There you can load your data and try many algorithms. Its weakness is scalability, but that is not a problem for playing with data.