开发者

Cassandra or MySQL/PostgreSQL?

开发者 https://www.devze.com 2022-12-24 16:29 出处:网络
I have huge database (kinda wordnet) and want to know if it\'s easier to use Cassandra instead of MySQL|PostrgreSQL

I have huge database (kinda wordnet) and want to know if it's easier to use Cassandra instead of MySQL|PostrgreSQL

All my life I was using MySQL and PostrgreSQL and I could easily think in terms of relational algebra, but several weeks ago I learned about Cassand开发者_如何学Cra and that it's used in Facebook and Twitter.

Is it more convenient?

What DBMS are usually used nowadays to store social net's data, relationships between objects, wordnet?


There is nothing like a Silver bullet solution, everything is built to solve specific problem and has its own pros and cons. It is up to you to decide - what problem statement you have and what is best solution that fits your problem. Whether you use Cassandra (NoSQL) or MySQL(RDBMS), it is all driven from your system's requirements. Below are the inputs that will help you in taking better decision while deciding on database.

Why to Use NoSQL

In the case of RDBMS database, making choice is quite easy because almost all the databases like MySQL, Oracle, MS SQL, PostgreSQL in this category offer almost same kind of solutions oriented to the ACID property. When it comes to NoSQL, decision becomes difficult because every NoSQL database offers different solution and you have to understand which one is best suited for your app/system requirement. For example, MongoDB fits for use cases where your system demands schema-less document store. HBase might fit for Search engines, analysing log data, any place where scanning huge, two-dimensional join-less tables is a requirement. Redis is built to provide In-Memory search for varieties of data structures like tree, queue, link list etc and can be good fit for making real time leader board, pub-sub kind of system. Similarly there are other database in this category (including Cassandra) which fits for different problems. Now lets move to original question, and answer them one by one.

When to use Cassandra

Being a part of NoSQL family, Cassandra offers solution for problem where your requirement is to have very heavy write system and you want to have quite responsive reporting system on top of that stored data. Consider use case of Web analytics where log data is stored for each request and you want to built analytical platform around it to count hits by hour, by browser, by IP, etc in real time manner. You can refer to blog post (http://blogs.shephertz.com/2015/04/22/why-cassandra-excellent-choice-for-realtime-analytics-workload/) to understand more about the use cases where Cassandra fits in.

When to Use a RDMS instead of Cassandra/NoSQL

Cassandra is based on NoSQL database and does not provide ACID and relational data property. If you have strong requirement of ACID property (for example Financial data), Cassandra would not be a fit in that case. Obviously, you can make work out of it, however you will end up writing lots of application code to handle ACID property and will loose on time to market badly. Also managing that kind of system with Cassandra would be complex and tedious for you.


There are many different flavours of "NoSQL" databases. If your application is really like Wordnet perhaps you should look at a graph database such as Neo4j.


I would suggest to analyse your request.

  1. If you are going with more clusters, machines take NoSQL
  2. If your data model is complicated - require efficient structures take NoSQL (no limits with type of columns)
  3. If you fit in a few machines without scales, and you don't need super performance for multi request (as for example in social network - where lot of users send http request), and you don't think you involve saleability take RDBMS (Postgres have some good functions and structures which you can use, like array column type).

Cassandra should work better with large scales of data, multi purpose. neo4j - would be better for special structures, graphs.


Cassandra and other NoSQL stores are being used for social based sites because of their need for massive write based operations. Not that MySQL and Postgres can't achieve this but NoSQL requires far less time and money, generally speaking.

Sounds like you may want to look at Neo4J though, just in terms of your object model needs.


All different products and they all have their pro's and conn's. What kind of problem do you have to solve?

Huge, as in TB's?

0

精彩评论

暂无评论...
验证码 换一张
取 消