
How can I add more tagged words to the Stanford POS-Tagger's trained models?

Developer (https://www.devze.com), 2023-02-22 07:26. Source: the web



I haven't found anything in the documentation about adding more tagged words to the tagger, specifically the bi-directional one. Thanks

At present, you can't. Model training is an all-at-one-time operation. (Since the tagger uses weights that take into account contexts and frequencies, it isn't trivial to add new words to it post hoc.)

There is a workaround. It is ugly but should do the trick:

Build a list of "your" words.
Scan the text for these words.
If any matches are found, do the POS tagging for them yourself (NLTK can help you here).
Feed the pre-tagged text to the Stanford parser.

From http://www.cs.ucf.edu/courses/cap5636/fall2011/nltk.pdf: "You can also give it POS tagged text; the parser will try to use your tags if they make sense. You might want to do this if the parser makes tagging mistakes in your text domain."
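
The first three steps of the workaround can be sketched in plain Python. This is only an illustration under assumptions: the lexicon entries, the `pretag` helper, and the `word/TAG` output convention are all made up for the example, and the tokenizer is a simple regular expression rather than NLTK's. How the pre-tagged text is then handed to the Stanford parser (for example, which tag separator it expects) depends on your parser version, so check its documentation before relying on this format.

```python
import re

# Hypothetical domain lexicon: lowercase word -> Penn Treebank tag.
# These entries are illustrative; replace them with "your" words.
CUSTOM_TAGS = {
    "kubectl": "NN",
    "refactor": "VB",
}

# Naive tokenizer: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def pretag(text, lexicon=CUSTOM_TAGS, sep="/"):
    """Attach a tag to every token found in the lexicon; leave the rest untagged."""
    out = []
    for token in TOKEN_RE.findall(text):
        tag = lexicon.get(token.lower())  # case-insensitive lookup
        out.append(f"{token}{sep}{tag}" if tag else token)
    return " ".join(out)


tagged = pretag("Please refactor the kubectl wrapper.")
print(tagged)  # Please refactor/VB the kubectl/NN wrapper .
```

Only the words in the lexicon receive a tag here; everything else is left for the parser's own tagger to decide, which matches the quoted note that the parser tries to use supplied tags when they make sense.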