Hi, I want to use MALLET's topic modeling, but can I provide my own tokenizer, or a pre-tokenized version of the text documents, when I import the data into MALLET? I find MALLET's tokenizer inadequate for my usage...
OK, I got it. Simply replace the default tokenizer in the serial pipe with my own, then use that pipe for the instance list.
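For anyone landing here later, this is a rough sketch of the kind of tokenizer I mean. It is plain JDK code with no MALLET dependency; the class name `CustomTokenizer` and the token rule (lowercase, keep hyphenated words and contractions in one piece) are just my own illustrative choices, not anything MALLET provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical custom tokenizer: the name and the rules are illustrative only.
final class CustomTokenizer {
    // A token is a run of letters/digits, optionally joined by '-' or '\''
    // (so "e-mail" and "don't" stay in one piece).
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{N}]+(?:[-'][\\p{L}\\p{N}]+)*");

    // Lowercases the text and returns every match of TOKEN, in order.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }
}
```

To plug this into MALLET, as far as I can tell the logic would go inside a subclass of `cc.mallet.pipe.Pipe` whose `pipe(Instance)` method replaces the instance data with a `TokenSequence` built from these tokens. That pipe then takes the place of `CharSequence2TokenSequence` in the `SerialPipes` list, ahead of `TokenSequence2FeatureSequence`, and the documents are added with `InstanceList.addThruPipe`. Treat that wiring as an outline and check it against the MALLET version you are using.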