How can I add more tagged words to the Stanford POS-Tagger's trained models?

Developer https://www.devze.com 2023-02-22 07:26 Source: Internet
I haven't found anything in the documentation about adding more tagged words to the tagger, specifically the bi-directional one. Thanks.


At present, you can't. Model training is an all-at-one-time operation. (Since the tagger uses weights that take into account contexts and frequencies, it isn't trivial to add new words to it post hoc.)


There is a workaround. It is ugly but should do the trick:

  • build a list of "your" words
  • scan text for these words
  • if any matches are found, do the POS tagging for those words yourself (NLTK can help you here)
  • feed the pre-tagged text to the Stanford parser.

FROM: http://www.cs.ucf.edu/courses/cap5636/fall2011/nltk.pdf "You can also give it POS tagged text; the parser will try to use your tags if they make sense. You might want to do this if the parser makes tagging mistakes in your text domain."
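
The workaround above could be sketched roughly like this in plain Python. The lexicon contents, the `pretag` helper, and the `word/TAG` output format with a slash separator are all assumptions made for illustration. Check the parser's documentation for how it expects pre-tagged input and how to set the tag separator.

```python
# Hypothetical domain lexicon: words you want to force a tag on,
# mapped to Penn Treebank tags. Replace with your own list.
CUSTOM_TAGS = {
    "kubernetes": "NNP",
    "grep": "VB",
}


def pretag(sentence, lexicon=CUSTOM_TAGS, sep="/"):
    """Attach a tag to every token found in the lexicon; leave the rest bare.

    Tokenization here is a plain whitespace split, which is an assumption;
    punctuation attached to a word will prevent a match.
    """
    out = []
    for token in sentence.split():
        tag = lexicon.get(token.lower())
        out.append(f"{token}{sep}{tag}" if tag else token)
    return " ".join(out)


if __name__ == "__main__":
    # Words not in the lexicon stay untagged, so the parser's own
    # tagger still decides their tags.
    print(pretag("Please grep the Kubernetes logs"))
```

The untagged words are left for the parser to handle, so only "your" words are overridden. If you would rather tag the whole sentence yourself, NLTK's taggers can produce the remaining tags before you join everything into the same format.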
