I'm writing a program in Java that needs to parse natural language. I need this to be done using probability and statistics. Are there any resources that can easily explain Statistical Natural Language Processing techniques?
A commonly cited "introductory" reference is Foundations of Statistical Natural Language Processing (1999) by Manning & Schütze. While it is comprehensive, relatively accessible, and certainly an excellent reference, it may be overkill for a more casual introduction to the field.
You may also find online courses, such as the Short course on Statistical Methods in NLP.
Also, since you mentioned Java, you can pick up a general-purpose "toolbox" such as
- Weka
- Stanford CoreNLP
- Apache OpenNLP
- GATE
and start getting hands-on exposure to specific areas of NLP such as, say, POS Tagging or Entity Extraction.
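To give a feel for what "statistical" means in a task like POS tagging, here is a minimal sketch in plain Java of a most-frequent-tag baseline: it counts how often each word appears with each tag in hand-tagged data and picks the most common one. The class name `BaselineTagger`, its methods, and the toy training pairs are all made up for illustration and do not come from any of the libraries above, which use far richer models.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: a most-frequent-tag baseline tagger.
class BaselineTagger {
    // word -> (tag -> number of times the word was seen with that tag)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Record one (word, tag) observation from tagged training data. */
    void observe(String word, String tag) {
        counts.computeIfAbsent(word.toLowerCase(), k -> new HashMap<>())
              .merge(tag, 1, Integer::sum);
    }

    /** Most frequent tag seen for the word, or the fallback if the word is unseen. */
    String tag(String word, String fallback) {
        Map<String, Integer> tagCounts = counts.get(word.toLowerCase());
        if (tagCounts == null) {
            return fallback;
        }
        String best = fallback;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : tagCounts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    /** Relative-frequency estimate of P(tag | word); 0.0 for unseen words. */
    double probability(String word, String tag) {
        Map<String, Integer> tagCounts = counts.get(word.toLowerCase());
        if (tagCounts == null) {
            return 0.0;
        }
        int total = 0;
        for (int c : tagCounts.values()) {
            total += c;
        }
        return tagCounts.getOrDefault(tag, 0) / (double) total;
    }

    public static void main(String[] args) {
        // Toy training data, invented for this example.
        String[][] training = {
            {"the", "DET"}, {"dog", "NOUN"}, {"barks", "VERB"},
            {"run", "VERB"}, {"run", "NOUN"}, {"run", "VERB"}
        };
        BaselineTagger tagger = new BaselineTagger();
        for (String[] pair : training) {
            tagger.observe(pair[0], pair[1]);
        }
        // Unseen words fall back to NOUN, an arbitrary choice for this sketch.
        for (String word : "the dog can run".split(" ")) {
            System.out.println(word + "/" + tagger.tag(word, "NOUN"));
        }
    }
}
```

A baseline like this ignores context entirely, which is exactly the gap that the taggers in the toolkits above are designed to close; it is useful mainly as a yardstick when you start experimenting.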
Also worth mentioning, although it is tied to the Python-based NLTK, is the online (and hardcopy) book Natural Language Processing with Python, which is a very practical guide to common NLP tasks.
There is a bit of a catch-22 with getting one's feet wet with NLP: it is a rather extensive field of study and practice, rife with both scholarly research and time- and industry-tested practices and libraries. Until one has a better grasp of which NLP applications suit a given problem, one may waste a lot of time poking at technologies that are either immature or not well suited to the problem at hand.