Efficient natural language data structure, persistence and querying

Developer (https://www.devze.com), 2023-01-04 09:46, source: the web
For use in a language-learning web application, do you know of data structures and an underlying database schema/layout that would allow efficient storage, processing and querying of sentences, verbs, nouns etc. for different natural languages? For example, I would like to store each verb only once and link sentences to a verb object, etc.

I came across concrete syntax trees and I am thinking of using an abstract Node class and deriving a Noun class from it, etc. Would a syntax tree structure be too restrictive?

I realise this is quite a broad question and I do not expect you to do my 'homework', but if you could point me to any resources you know of that may help me get started, that would be greatly appreciated.

Thank you

Martijn


Your example looks pretty solid for manipulating natural-language sentences.
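To make the Node/Noun idea from the question a bit more concrete, here is a minimal Python sketch. It is only one possible layout, not a recommended design: the names `Lexeme`, `Lexicon`, `Word` and `Sentence` are made up for illustration, and only `Node` and `Noun` come from the question. Each dictionary entry is created once and every sentence links to that same object, while the surface form as written stays on the tree node.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexeme:
    """Hypothetical dictionary entry, stored once per (language, lemma, pos)."""
    language: str
    lemma: str
    pos: str  # part of speech, e.g. "noun" or "verb"


class Lexicon:
    """Hands out one shared Lexeme object per distinct entry."""

    def __init__(self):
        self._entries = {}

    def get(self, language, lemma, pos):
        key = (language, lemma, pos)
        if key not in self._entries:
            self._entries[key] = Lexeme(language, lemma, pos)
        return self._entries[key]


@dataclass
class Node:
    """Base class for every node in the syntax tree."""
    children: list = field(default_factory=list)

    def leaves(self):
        # Walk the tree left to right and yield the word nodes.
        for child in self.children:
            yield from child.leaves()


@dataclass
class Word(Node):
    """Leaf node: the form as written plus a link to the shared lexeme."""
    surface: str = ""
    lexeme: Lexeme = None

    def leaves(self):
        yield self


class Noun(Word):
    pass


class Verb(Word):
    pass


class Sentence(Node):
    def text(self):
        return " ".join(leaf.surface for leaf in self.leaves())


lexicon = Lexicon()
s1 = Sentence(children=[
    Noun(surface="dogs", lexeme=lexicon.get("en", "dog", "noun")),
    Verb(surface="run", lexeme=lexicon.get("en", "run", "verb")),
])
s2 = Sentence(children=[
    Noun(surface="she", lexeme=lexicon.get("en", "she", "noun")),
    Verb(surface="ran", lexeme=lexicon.get("en", "run", "verb")),
])
# Both sentences point at the very same verb entry.
shared = s1.children[1].lexeme is s2.children[1].lexeme
```

In a relational database the same split would probably become one table for lexemes and one linking table from sentences to lexemes, but that mapping is an assumption on my part and depends on the queries you need.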

As for other options: for text search/storage, you might take a look at the Patricia tree (a compact radix trie). There is a Java implementation of it on Google Code.
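To give an idea of what such a tree does, here is a small radix-tree sketch in Python. It is a simplified illustration written for this answer, not the Java implementation mentioned above, and the class and method names (`RadixTree`, `insert`, `starts_with`) are assumptions. Edges carry whole substrings rather than single characters, so common prefixes of words are stored only once.

```python
class RadixNode:
    def __init__(self):
        self.edges = {}       # first character -> (edge label, child node)
        self.is_word = False  # True if a stored word ends at this node


class RadixTree:
    def __init__(self):
        self.root = RadixNode()

    def insert(self, word):
        node = self.root
        while word:
            edge = node.edges.get(word[0])
            if edge is None:
                # No edge shares a first character: hang the rest as a leaf.
                leaf = RadixNode()
                leaf.is_word = True
                node.edges[word[0]] = (word, leaf)
                return
            label, child = edge
            # Length of the common prefix of the edge label and the word.
            n = 0
            while n < len(label) and n < len(word) and label[n] == word[n]:
                n += 1
            if n < len(label):
                # Only part of the label matches: split the edge at n.
                middle = RadixNode()
                middle.edges[label[n]] = (label[n:], child)
                node.edges[word[0]] = (label[:n], middle)
                child = middle
            node = child
            word = word[n:]
        node.is_word = True

    def starts_with(self, prefix):
        """Return all stored words that begin with prefix, sorted."""
        node, path, rest = self.root, "", prefix
        while rest:
            edge = node.edges.get(rest[0])
            if edge is None:
                return []
            label, child = edge
            if label.startswith(rest):
                rest = ""                 # prefix ends on or inside this edge
            elif rest.startswith(label):
                rest = rest[len(label):]  # consume the edge and keep going
            else:
                return []                 # mismatch partway along the edge
            path += label
            node = child
        results = []
        self._collect(node, path, results)
        return sorted(results)

    def _collect(self, node, path, results):
        if node.is_word:
            results.append(path)
        for label, child in node.edges.values():
            self._collect(child, path + label, results)


tree = RadixTree()
for w in ["run", "runner", "running", "read"]:
    tree.insert(w)
matches = tree.starts_with("run")
```

A prefix lookup like this is the kind of query that suits autocomplete or dictionary lookups in a language-learning application; for full-text search over sentences, one of the existing engines is likely a better fit.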

Also, have you considered using one of the existing solutions, such as Hunspell, Lucene or Sphinx?
