I'm looking for a Java-based solution for analysing sentences and logging whether a keyword was used positively or negatively.
For example, the keyword might be 'cabbages' and the sentence:
'I like cabbages but not peas'
I'd like a Java text analyser of some kind to log this as positive. Can the Lucene (Hibernate Search) libraries be used for this?
Any thoughts?
You're looking for "sentiment analysis". One possibility is LingPipe, whose authors kindly link to their competitors as well. Jeff Dalton also has a great list of natural language processing tools on his blog.
I doubt there's anything like that. Lucene definitely can't do it out of the box.
How do you even define "whether a key word was used positively or negatively" in a way that can be evaluated programmatically? To do it properly, you'd have to analyse the text for its actual meaning, which is an AI problem that is not even remotely solved.
I suppose you could solve it approximately by just doing a statistical analysis of whether the keyword appears more often close to positive (like, good, great, wonderful) or negative (bad, hate, crappy, damn) keywords, but even there, negations, sarcasm and complex sentence structures will be problematic.
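As a rough illustration of that approximate approach, here is a minimal sketch in plain Java (9 or later, no Lucene involved). The class name KeywordSentiment, the tiny word lists, the negator list and the window size are all assumptions made up for this example, not part of any library. It counts positive and negative words within a few tokens of the keyword and flips the polarity of a word directly preceded by a negator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class KeywordSentiment {
    // Tiny illustrative lexicons (assumed); real word lists would be far larger.
    private static final Set<String> POSITIVE = Set.of("like", "good", "great", "wonderful");
    private static final Set<String> NEGATIVE = Set.of("bad", "hate", "crappy", "damn");
    // Words that flip the polarity of the sentiment word immediately after them.
    private static final Set<String> NEGATORS = Set.of("not", "don't", "never", "no");

    // Lower-case the text and split on anything that is not a letter or apostrophe.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Sum the polarity of sentiment words within `window` tokens of each
    // occurrence of the keyword: > 0 leans positive, < 0 negative, 0 unknown.
    public static int score(String sentence, String keyword, int window) {
        List<String> tokens = tokenize(sentence);
        String key = keyword.toLowerCase(Locale.ROOT);
        int total = 0;
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(key)) {
                continue;
            }
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                String w = tokens.get(j);
                int polarity = POSITIVE.contains(w) ? 1 : NEGATIVE.contains(w) ? -1 : 0;
                // Crude negation handling: "not like" counts as negative.
                if (polarity != 0 && j > 0 && NEGATORS.contains(tokens.get(j - 1))) {
                    polarity = -polarity;
                }
                total += polarity;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(score("I like cabbages but not peas", "cabbages", 2)); // prints 1
        System.out.println(score("I do not like cabbages", "cabbages", 2));       // prints -1
    }
}
```

Note that with the same window, the score for 'peas' in the first sentence is 0 rather than negative: the "not" there negates an elided "like", which a word-proximity count cannot see. Widening the window to 4 makes it worse, because 'peas' then picks up the "like" that belongs to 'cabbages'. That is exactly the kind of sentence structure this approach gets wrong.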
Take a look at Mahout Taste, which builds on Lucene but adds a lot of what you need out of the box. (edit) I should add, Mahout Taste is merely related to what you're looking for and not a 100% match.