Most efficient architecture for website using Cassandra and Solr?

Developer (https://www.devze.com), 2023-02-07 15:38, source: web
I'm developing a website that will use Cassandra for database storage and Solr to index and search some of the data contained in that database (I only want some of the data to be searchable). I had intended to use PHP for server-side scripting, interfacing with the Cassandra database, and providing dynamic HTML content based on the contents of the database.

When a user commits something to the database, I envisioned PHP issuing the write to Cassandra, and if it were data that needed to be searched, that same data could be written to the Solr index. The thing is, I don't necessarily need the searchable data immediately available in the Solr index, nor do I want the process of adding it to the index through PHP consuming valuable resources, especially during peak traffic hours. Is there a way to have asynchronous updates to the Solr index occur in the background by transferring the data directly from Cassandra? Perhaps a queue of searchable data could be created that is used to update the Solr index during idle time by some background process?

I'm new to this whole thing, but I'd like the link between Cassandra and Solr to be insulated from the main PHP scripts. I'm not sure whether Cassandra and Solr can be linked efficiently in Java, with only the higher-level access to both Cassandra (for reading/writing to the database) and Solr (for querying the searchable data) kept in PHP for web content creation. I appreciate any suggestions.


Rather than operating Solr and Cassandra separately, you should consider Solandra, a Cassandra backend for Solr.

Read more about it here: http://github.com/tjake/Lucandra


You have lots of options.

A simple one is a scheduled job that grabs all the updates made since the last job run and does a batch insertion into Solr.
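
A rough sketch of that scheduled job, in Python for brevity. Everything named here is an assumption for illustration: the `updated_at` field on each row, the `id`/`title`/`body` field list, the core name `mycore`, and the local Solr URL. Reading the rows out of Cassandra is left out; the job only needs some iterable of dicts.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical endpoint; adjust host, port and core name to your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"

def select_updates_since(rows, last_run):
    """Keep only rows modified strictly after the previous job run."""
    return [r for r in rows if r["updated_at"] > last_run]

def build_solr_batch(rows, fields=("id", "title", "body")):
    """Serialize the searchable fields as a JSON array of documents."""
    docs = [{f: r[f] for f in fields if f in r} for r in rows]
    return json.dumps(docs).encode("utf-8")

def post_to_solr(payload, url=SOLR_UPDATE_URL):
    """Send one batch to Solr's JSON update handler."""
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

def run_batch(rows, last_run, post=post_to_solr):
    """Index everything changed since last_run; return how many docs were sent."""
    pending = select_updates_since(rows, last_run)
    if pending:
        post(build_solr_batch(pending))
    return len(pending)
```

Run from cron (or any scheduler) during quiet hours, storing the timestamp of each run so the next one knows where to start. Passing `post` as a parameter keeps the selection and serialization logic testable without a running Solr.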

Or you could do your Cassandra write and then issue an asynchronous POST to Solr, as described here: How do I make an asynchronous GET request in PHP?
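
The queue idea from the question can be sketched as follows, again in Python. This is a minimal in-process illustration, not a drop-in design: `IndexQueue`, `send_batch`, and `batch_size` are hypothetical names, and in a PHP deployment the queue would more likely live outside the web process (a table, a file, or a message broker) so a separate worker can drain it.

```python
import queue
import threading

class IndexQueue:
    """Collects documents and hands them to send_batch from a background thread."""

    def __init__(self, send_batch, batch_size=100):
        self._q = queue.Queue()
        self._send = send_batch          # callable that takes a list of docs
        self._batch_size = batch_size
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, doc):
        """Called on the request path; returns immediately."""
        self._q.put(doc)

    def close(self):
        """Flush whatever is left and stop the worker."""
        self._q.put(None)                # sentinel marks end of input
        self._worker.join()

    def _run(self):
        batch = []
        while True:
            doc = self._q.get()
            if doc is None:
                break
            batch.append(doc)
            if len(batch) >= self._batch_size:
                self._send(batch)
                batch = []
        if batch:                        # send the final partial batch
            self._send(batch)
```

The request handler only pays for a `put` on the queue; the cost of talking to Solr is moved to the worker, which is the insulation the question asks for.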

Since you don't need real-time search, you could also set the default commit size to be fairly large, so Solr commits less often.
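
On the server side this is usually the `autoCommit` setting in `solrconfig.xml`. A related per-request option is Solr's `commitWithin` update parameter, which asks Solr to commit within a given number of milliseconds instead of immediately. A small sketch of building such an update URL, where the base URL and core name are placeholders:

```python
from urllib.parse import urlencode

def solr_update_url(base_url, core, commit_within_ms=None):
    """Build an update URL, optionally asking Solr to commit within N milliseconds."""
    url = f"{base_url.rstrip('/')}/{core}/update"
    if commit_within_ms is not None:
        url += "?" + urlencode({"commitWithin": commit_within_ms})
    return url

# Example: let Solr fold this update into a commit within one minute.
print(solr_update_url("http://localhost:8983/solr", "mycore", 60000))
```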
