I'm looking for a library to simply validate the syntax of English natural-language sentences. It doesn't have to be correct all the time (and obviously some sentences will be ambiguous, or humans will disagree on their validity).
So for example, "jim likes the blue ball" would be valid, whereas "jim likes likes blue ball jim" would not be.
I've tried "Syntactic parser of English sentences" by Andrej Pancik, which appears to do what I want, but unfortunately it rejects most sentences that I'd consider valid.
Is there any code out there I can use? Otherwise I'm thinking of doing this myself by creating a parse tree with something like ANTLR and identifying nouns with WordNet.
You won't find this a) easy to do, or b) likely available as a package that just works.
People don't agree on what English is:
Colorless green ideas slept furiously.
Thus you can't really write a program that reliably does what you want. There are NLP parsers that claim to process much of English, but they aren't simple or small; I believe the so-called Stanford parser is one.
You can try to build your own, but you'll smack into the definition-of-English problem unless you strongly constrain what you consider to be valid English. And this will likely get you the same effect as you had with Pancik's parser. (The act of writing a parser is an insistence that the language looks like what the parser accepts, regardless of the truth.)
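To make the "strongly constrained English" point concrete, here is a minimal sketch of a hand-built recognizer using the CYK algorithm over a tiny grammar. The lexicon, the rules, and the name is_valid are all made up for illustration; nothing here comes from Pancik's parser or any real library.

```python
from itertools import product

# Hypothetical toy lexicon: word -> categories it may belong to.
LEXICON = {
    "jim": {"NP"},
    "likes": {"V"},
    "the": {"Det"},
    "blue": {"Adj"},
    "ball": {"Nom"},
}

# Hypothetical binary rules: (left child, right child) -> parent categories.
RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "Nom"): {"NP"},
    ("Adj", "Nom"): {"Nom"},
}


def is_valid(sentence: str) -> bool:
    """Return True if the toy grammar derives the whole sentence as S (CYK)."""
    words = sentence.lower().split()
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that can span words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICON.get(word, ()))
    # Build longer spans from every split point of two shorter spans.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length
            for mid in range(start + 1, end):
                for pair in product(table[start][mid], table[mid][end]):
                    table[start][end] |= RULES.get(pair, set())
    return "S" in table[0][n]


for text in (
    "jim likes the blue ball",        # accepted
    "jim likes likes blue ball jim",  # rejected
    "the blue ball likes jim",        # accepted: well-formed, odd meaning
    "jim often likes the blue ball",  # rejected: "often" is not in the lexicon
):
    print(text, "->", is_valid(text))
```

The last two cases show the trade-off: the recognizer happily accepts a well-formed sentence with an odd meaning, and it rejects perfectly good English as soon as a word or construction falls outside the grammar you wrote.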
Syntactic parsing is a broad research field. There are a lot of parsers available, but not in C#. The state-of-the-art parsers are listed in: http://aclweb.org/aclwiki/index.php?title=Parsing_(State_of_the_art)
A gentler starting point is NLTK, written in Python.
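As a rough idea of what that looks like, the sketch below feeds a small context-free grammar to NLTK's chart parser and treats "has at least one parse" as "syntactically valid". The grammar and the helper name parses are assumptions made up for this example, and a grammar covering real English would need vastly more rules and vocabulary. It assumes NLTK is installed (pip install nltk); no corpus downloads are needed for this.

```python
import nltk

# Hypothetical toy grammar, just large enough for the example sentences.
GRAMMAR = nltk.CFG.fromstring("""
    S -> NP VP
    VP -> V NP
    NP -> Det Nom | 'jim'
    Nom -> Adj Nom | 'ball'
    Det -> 'the'
    Adj -> 'blue'
    V -> 'likes'
""")
PARSER = nltk.ChartParser(GRAMMAR)


def parses(sentence: str) -> list:
    """Return every parse tree the toy grammar assigns to the sentence."""
    tokens = sentence.lower().split()
    try:
        return list(PARSER.parse(tokens))
    except ValueError:
        # NLTK raises ValueError when a token is missing from the grammar.
        return []


for text in ("jim likes the blue ball", "jim likes likes blue ball jim"):
    trees = parses(text)
    print(text, "->", "valid" if trees else "invalid")
    for tree in trees:
        print(tree)
```

An empty list means the grammar rejects the sentence, while more than one tree would indicate a sentence that is ambiguous under that grammar.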