Hi, the aim is to parse a sizeable corpus like Wikipedia to generate the most probable parse trees and to perform named entity recognition. Which is the best library to achieve this in terms of performance and accuracy? Has anyone used more than one of the above libraries?
In my experiments I use the Stanford tagger, but it really depends on the quality of your Wikipedia articles. Here you will find a comparison of different part-of-speech tagging implementations - PoS on aclweb.
I'm currently using the Enju HPSG parser, which seems to be better than the others.
Refer to this paper: http://nlp.stanford.edu/pubs/lrecstanforddeps_final_final.pdf