I am interested in learning about text classification, so I have been reading up on the theory. The next step is to get some hands-on practice, so I am looking at different tools. Some links point to WEKA; Mallet seems to be a better fit for this task, but nobody links to it. Is there any reason to stay away from Mallet for a "serious" project? I was able to quickly train and test some classifiers with Mallet, whereas with WEKA I ran into a problem with my labels "disappearing" after using filters to transform my text files, which are stored in folders named after the category of the texts they contain.
It depends on the task you are performing. Mallet is also a widely used tool, and both Weka and Mallet have their pros and cons. For trivial tasks, both are easy to use. I generally prefer Weka for clustering and classification tasks.
Note: Do not be misled by the popularity of Weka in forum posts. That is mainly because Weka has been in use for longer, while Mallet is newer by comparison.