开发者

Combine MySQL, Sphinx and MongDB. Good idea?

开发者 https://www.devze.com 2023-03-23 17:48 出处:网络
For a new project I\'m looking to combine MySQL, Sphinx and MongoDB. MySQL for the relational data and searching on numeric values, Sphinx for free text search and MongoDB for geodata. As far as my (q

For a new project I'm looking to combine MySQL, Sphinx and MongoDB. MySQL for the relational data and searching on numeric values, Sphinx for free text search and MongoDB for geodata. As far as my (quick) benchmarks shows MongoDB is the fastest for geo queries, sphinx for free text search and MySQL for relational data searches. So to get the best performance I might have to combine them in my project.

There are however three drawbacks to this.

  1. Three points of failure, i.e. Sphinx, MySQL, and MongoDB can crash which will stop my site
  2. I need data in three databases and need to keep them up to date (all data only changes ones per day so its not the worst problem).
  3. Hardware requirement开发者_开发技巧s and mainly RAM is going through the roof since all databases wants to have a large portion of the RAM to be able to perform.

So the questions is should I combine the three, leave one out (probably MongoDB and use Sphinx for geodata as well) or even go with only one (MongoDB or MySQL)?

To give an idea of the data, the relational data is aprox 6GB, the geodata about 4GB and the freetext data about 16GB.


Didn't quite understood if the records/collections/documents contained in the 3 dbs have inter-db references. EG if user names, jobs, telephone numbers are in Mysql and user addresses are in Mongo. I'll assume that the answer is Yes.

IMHO having 3 different storage solutions is not recommended, because:

1) (most important) You can not aggregate data from 2 DBs (in a scalable way).

Example: Let's say that you keep user data (user names) in Mysql and user geo coordinates in Mongo. You can't query having filters/sorts on fields located on both dbs. For example, you can't:

SELECT all users 
WHERE name starts with 'A'
SORT BY distance_from_center

Same applies for Sphinx.

Solution: you either limit to data available on a single DB, or you duplicate/mirror data from one db to another.

2) Maintenance costs: 3 servers to maintain, different backup/redundance strategies, different scaling strategies; Development costs: developer must use 3 querying libraries, with 3 different ways to query, etc etc.

3) Inconsistence/Synchronization issues that must be manually dealt with(EG you want to insert data both in mongo and in mysql; let's say that mongo wrote the data, but mysql raised a referential integrity exception, so now you have an inconsistency between dbs)

4) About HW costs, the only RAM-eater is MongoDB (the recommendation is that it has to have all indexes in ram). For MySQL and Solr servers, you can control memory consumption.

What I would do:

  • If I don't need all the SQL features (like transactions, referential integrity, joins, etc) I would go with Mongo

  • If I need those features, and I can live with a lower performance on geo operations, I would go with MySQL

  • now, If I need (I mean, I really really need) full-text search, and Mongo/Mysql FTS capabilities are not enough, I would attach also a FTS server like Sphinx, Solr, Elasticsearch, etc

0

精彩评论

暂无评论...
验证码 换一张
取 消