What I want is not a comparison between Redis and MongoDB. I know they are different; the performance and the APIs are totally different.
Redis is very fast, but the API is very 'atomic'. MongoDB will eat more resources, but the API is very easy to use, and I am very happy with it.
They're both awesome, and I want to use Redis in deployment as much as I can, but it is hard to code for. I want to use MongoDB in development as much as I can, but it needs an expensive machine.
So what do you think about using both of them? When to pick Redis? When to pick MongoDB?
I would say it depends on what kind of dev team you are and what your application needs.
For example, if you require a lot of querying, that mostly means more work for your developers with Redis, where your data might be stored in a variety of specialized data structures, customized for each type of object for efficiency. In MongoDB the same queries might be easier because the structure is more consistent across your data. On the other hand, in Redis, the sheer speed of the response to those queries is the payoff for the extra work of dealing with the variety of structures your data might be stored in.
MongoDB offers simplicity and a much shorter learning curve for developers with traditional DB and SQL experience. Redis's non-traditional approach requires more effort to learn, but offers greater flexibility.
E.g. a cache layer can probably be better implemented in Redis. For more schema-able data, MongoDB is better. [Note: both MongoDB and Redis are technically schemaless]
If you ask me, my personal choice is Redis for most requirements.
Lastly, I hope by now you have seen http://antirez.com/post/MongoDB-and-Redis.html
I just noticed that this question is quite old. Nevertheless, I consider the following aspects to be worth adding:
Use MongoDB if you don't know yet how you're going to query your data.
MongoDB is suited for hackathons, startups, or any time you don't know how you'll query the data you inserted. MongoDB does not make any assumptions about your underlying schema. While MongoDB is schemaless and non-relational, this does not mean that there is no schema at all. It simply means that your schema needs to be defined in your app (e.g. using Mongoose). Besides that, MongoDB is great for prototyping or trying things out. Its performance is not that great and can't be compared to Redis.
Use Redis in order to speed up your existing application.
Redis can be easily integrated as an LRU cache. It is very uncommon to use Redis as a standalone database system (some people prefer referring to it as a "key-value" store). Websites like Craigslist use Redis next to their primary database. Antirez (developer of Redis) demonstrated with Lamernews that it is indeed possible to use Redis as a standalone database system.
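To make the LRU idea concrete, here is a minimal in-process sketch in Python. It is not Redis and not how Redis is implemented; the class name LRUCache and its capacity parameter are made up for illustration. Redis applies a similar (approximated) policy on its own when maxmemory is set together with an eviction policy such as allkeys-lru.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (hypothetical helper, not part of Redis)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.set("/home", "<html>home</html>")
cache.set("/about", "<html>about</html>")
cache.get("/home")                      # touch /home so /about becomes the oldest
cache.set("/faq", "<html>faq</html>")   # over capacity: /about is evicted
```

The point is only that the application keeps asking for keys and the cache decides what to forget; your primary database stays the source of truth.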
Redis does not make any assumptions about your data.
Redis provides a bunch of useful data structures (e.g. Sets, Hashes, Lists), but you have to explicitly define how you want to store your data. To put it in a nutshell, Redis and MongoDB can be used to achieve similar things. Redis is simply faster, but not suited for prototyping. That's one use case where you would typically prefer MongoDB. Besides that, Redis is really flexible. The underlying data structures it provides are the building blocks of high-performance DB systems.
When to use Redis?
Caching
Caching using MongoDB simply doesn't make a lot of sense. It would be too slow.
If you have enough time to think about your DB design.
You can't simply throw your documents into Redis. You have to think about the way in which you want to store and organize your data. One example is hashes in Redis. They are quite different from "traditional" nested objects, which means you'll have to rethink the way you store nested documents. One solution would be to store a reference inside the hash to another hash (something like key: [id of second hash]). Another idea would be to store it as JSON, which seems counter-intuitive to most people with a *SQL background.
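Both ideas can be sketched in a few lines of Python. This is only an illustration under assumptions: the store dict stands in for a Redis server (with a real client the writes would be HSET calls and the reads HGETALL), and the key names and helper functions are hypothetical.

```python
import json

# Stand-in for a Redis server: key -> flat hash (dict of str -> str).
store = {}


def save_user_with_reference(user_id, user):
    """Option 1: put the nested 'address' in its own hash and keep only its key."""
    address_key = f"address:{user_id}"
    store[address_key] = dict(user["address"])
    store[f"user:{user_id}"] = {"name": user["name"], "address": address_key}


def load_user_with_reference(user_id):
    flat = store[f"user:{user_id}"]
    # Second lookup: follow the stored key to the address hash.
    return {"name": flat["name"], "address": dict(store[flat["address"]])}


def save_user_as_json(user_id, user):
    """Option 2: serialize the nested part into a single hash field."""
    store[f"user_json:{user_id}"] = {
        "name": user["name"],
        "address": json.dumps(user["address"]),
    }


def load_user_as_json(user_id):
    flat = store[f"user_json:{user_id}"]
    return {"name": flat["name"], "address": json.loads(flat["address"])}


user = {"name": "Simon", "address": {"city": "Berlin", "zip": "10115"}}
save_user_with_reference(1, user)
save_user_as_json(1, user)
```

The reference variant costs an extra round trip per nested level but lets you update the address on its own; the JSON variant is one read, but the nested part is opaque to Redis.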
If you need really high performance.
Beating the performance Redis provides is nearly impossible. Imagine your database being as fast as your cache. That's what it feels like using Redis as a real database.
If you don't care that much about scaling.
Scaling Redis is not as hard as it used to be. For instance, you could use a kind of proxy server to distribute the data among multiple Redis instances. Master-slave replication is not that complicated, but distributing your keys among multiple Redis instances needs to be done on the application side (e.g. using a hash function, modulo, etc.). Scaling MongoDB by comparison is much simpler.
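A minimal sketch of that application-side distribution, assuming three instances whose addresses are made up here; instance_for is a hypothetical helper, not part of any Redis client.

```python
import zlib

# Hypothetical instance addresses; replace with your own.
INSTANCES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def instance_for(key, instances=INSTANCES):
    """Pick an instance by hashing the key and taking it modulo the instance count."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]
```

Every client that uses the same list in the same order sends a given key to the same instance. The catch with plain modulo is that adding or removing an instance remaps most keys, which is why people often move to consistent hashing or a proxy once the instance count starts changing.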
When to use MongoDB
Prototyping, Startups, Hackathons
MongoDB is perfectly suited for rapid prototyping. Nevertheless, performance isn't that good. Also keep in mind that you'll most likely have to define some sort of schema in your application.
When you need to change your schema quickly.
Because there is no schema! Altering tables in traditional, relational DBMS is painfully expensive and slow. MongoDB solves this problem by not making a lot of assumptions on your underlying data. Nevertheless, it tries to optimize as far as possible without requiring you to define a schema.
TL;DR
- Use Redis if performance is important and you are willing to spend time optimizing and organizing your data.
- Use MongoDB if you need to build a prototype without worrying too much about your DB.
Further reading:
- Interesting aspects to consider when using Redis as a primary data store
Redis. Let’s say you’ve written a site in php; for whatever reason, it becomes popular and it’s ahead of its time or has porno on it. You realize this php is so freaking slow, "I’m gonna lose my fans because they simply won’t wait 10 seconds for a page." You have a sudden realization that a web page has a constant url (it never changes, whoa), a primary key if you will, and then you recall that memory is fast while disk is slow and php is even slower. :( Then you fashion a storage mechanism using memory and this URL that you call a "key" while the webpage content you decide to call the "value." That’s all you have - key and content. You call it "meme cache." You like Richard Dawkins because he's awesome. You cache your html like squirrels cache their nuts. You don’t need to rewrite your crap php code. You are happy. Then you see that others have done it -- but you choose Redis because the other one has confusing images of cats, some with fangs.
Mongo. You’ve written a site. Heck you’ve written many, and in any language. You realize that much of your time is spent writing those stinking SQL clauses. You’re not a dba, yet there you are, writing stupid sql statements... not just one but freaking everywhere. "select this, select that". But in particular you remember the irritating WHERE clause. Where lastname equals "thornton" and movie equals "bad santa." Urgh. You think, "why don’t those dbas just do their job and give me some stored procedures?" Then you forget some minor field like middlename and then you have to drop the table, export all 10G of big data and create another with this new field, and import the data -- and that goes on 10 times during the next 14 days as you keep on remembering crap like salutation, title, plus adding a foreign key with addresses. Then you figure that lastname should be lastName. Almost one change a day. Then you say darnit. I have to get on and write a web site/system, never mind this data model bs. So you google, "I hate writing SQL, please no SQL, make it stop" but up pops 'nosql' and then you read some stuff and it says it just dumps data without any schema. You remember last week's fiasco dropping more tables and smile. Then you choose mongo because some big guys like 'airbud' the apt rental site uses it. Sweet. No more data model changes because you have a model you just keep on changing.
Maybe this resource is useful in helping you decide between the two. It also discusses several other NoSQL databases, and offers a short list of characteristics, along with a "what I would use it for" explanation for each of them.
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Difficult question to answer - as with most technology solutions, it really depends on your situation and since you have not described the problem you are trying to solve, how can anyone propose a solution?
You need to test them both to see which of them satisfies your needs.
With that said, MongoDB does not require any expensive hardware. Like any other database solution, it will work better with more CPU and memory, but that is certainly not a requirement, especially for early development purposes.
Redis is an in-memory data store that can persist its state to disk (to enable recovery after restart). However, being an in-memory data store means the size of the data store (on a single node) cannot exceed the total memory space on the system (physical RAM + swap space). In reality, it will be much less than this, as Redis is sharing that space with many other processes on the system, and if it exhausts the system memory space it will likely be killed off by the operating system.
Mongo is a disk-based data store that is most efficient when its working set fits within physical RAM (like all software). Being a disk-based data store means there are no intrinsic limits on the size of a Mongo database; however, configuration options, available disk space, and other concerns may mean that database sizes over a certain limit become impractical or inefficient.
Both Redis and Mongo can be clustered for high availability, backup and to increase the overall size of the datastore.
All of the answers (at the time of this writing) assume each of Redis, MongoDB, and perhaps an SQL-based relational database are essentially the same tool: "store data". They don't consider data models at all.
MongoDB: Complex Data
MongoDB is a document store. To compare with an SQL-driven relational database: relational databases simplify to indexed CSV files, each file being a table; document stores simplify to indexed JSON files, each file being a document, with multiple files grouped together.
JSON files are similar in structure to XML and YAML files, and to dictionaries as in Python, so think of your data in that sort of hierarchy. When indexing, the structure is the key: A document contains named keys, which contain either further documents, arrays, or scalar values. Consider the below document.
{
    _id: 0x194f38dc491a,
    Name: "John Smith",
    PhoneNumber: {
        Home: "555 999-1234",
        Work: "555 999-9876",
        Mobile: "555 634-5789"
    },
    Accounts: [
        "379-1111",
        "379-2574",
        "414-6731"
    ]
}
The above document has a key, PhoneNumber.Mobile, which has value 555 634-5789. You can search through a collection of documents where the key, PhoneNumber.Mobile, has some value; such keys can be indexed.
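To show what "searching by a dotted key" means, here is a rough Python sketch over plain dictionaries. It is not MongoDB; get_path and find are hypothetical helpers, and the second document is made up. In MongoDB's own query language the equivalent filter would look like {"PhoneNumber.Mobile": "555 634-5789"}.

```python
def get_path(document, dotted_key):
    """Follow a dotted key such as 'PhoneNumber.Mobile' through nested dicts."""
    value = document
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def find(collection, dotted_key, wanted):
    """Linear scan; a real document store would consult an index instead."""
    return [doc for doc in collection if get_path(doc, dotted_key) == wanted]


john = {
    "Name": "John Smith",
    "PhoneNumber": {
        "Home": "555 999-1234",
        "Work": "555 999-9876",
        "Mobile": "555 634-5789",
    },
    "Accounts": ["379-1111", "379-2574", "414-6731"],
}
jane = {"Name": "Jane Doe", "PhoneNumber": {"Mobile": "555 000-0000"}}  # made-up document
people = [john, jane]
```

Note that jane has no Home number at all, and nothing breaks: documents in the same collection do not have to share the same keys.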
It also has an array, Accounts, which holds multiple indexed values. It is possible to query for a document where Accounts contains exactly some subset of values, all of some subset of values, or any of some subset of values. That means you can search for Accounts = ["379-1111", "379-2574"] and not find the above; you can search for Accounts includes ["379-1111"] and find the above document; and you can search for Accounts includes any of ["974-3785","414-6731"] and find the above and whatever document includes account "974-3785", if any.
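The three kinds of array match can be spelled out in Python like this. These are hypothetical helper functions over a plain dict, not MongoDB calls; in MongoDB they correspond roughly to plain array equality, the $all operator, and the $in operator.

```python
def matches_exactly(doc, values):
    """The array equals the given list: same elements in the same order."""
    return doc.get("Accounts") == values


def includes_all(doc, values):
    """Every given value appears somewhere in the array."""
    return set(values) <= set(doc.get("Accounts", []))


def includes_any(doc, values):
    """At least one given value appears in the array."""
    return bool(set(values) & set(doc.get("Accounts", [])))


john = {"Name": "John Smith", "Accounts": ["379-1111", "379-2574", "414-6731"]}
```

With the document above, matches_exactly(john, ["379-1111", "379-2574"]) is false because the third account is missing from the list, while includes_all(john, ["379-1111"]) and includes_any(john, ["974-3785", "414-6731"]) are both true.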
Documents go as deep as you want. PhoneNumber.Mobile could hold an array, or even a sub-document (PhoneNumber.Mobile.Work and PhoneNumber.Mobile.Personal). If your data is highly structured, documents are a large step up from relational databases.
If your data is mostly flat, relational, and rigidly structured, you're better off with a relational database. Again, the big sign is whether your data models best to a collection of interrelated CSV files or a collection of XML/JSON/YAML files.
For most projects, you'll have to compromise, accepting a minor work-around in some small areas where either SQL or Document Stores don't fit; for some large, complex projects storing a broad spread of data (many columns; rows are irrelevant), it will make sense to store some data in one model and other data in another model. Facebook uses both SQL and a graph database (where data is put into nodes, and nodes are connected to other nodes); Craigslist used to use MySQL and MongoDB, but had been looking into moving entirely onto MongoDB. These are places where the span and relationship of the data faces significant handicaps if put under one model.
Redis: Key-Value
Redis is, most basically, a key-value store. Redis lets you give it a key and look up a single value. Redis itself can store strings, lists, hashes, and a few other things; however, it only looks up by name.
Cache invalidation is one of computer science's hard problems; the other is naming things. That means you'll use Redis when you want to avoid hundreds of excess look-ups to a back-end, but you'll have to figure out when you need a new look-up.
The most obvious case of invalidation is update on write: if you read user:Simon:lingots = NOTFOUND, you might SELECT Lingots FROM Store s INNER JOIN UserProfile u ON s.UserID = u.UserID WHERE u.Username = 'Simon' and store the result, 100, as SET user:Simon:lingots = 100. Then when you award Simon 5 lingots, you read user:Simon:lingots = 100, SET user:Simon:lingots = 105, and UPDATE Store s INNER JOIN UserProfile u ON s.UserID = u.UserID SET s.Lingots = 105 WHERE u.Username = 'Simon'. Now you have 105 in your database and in Redis, and can get user:Simon:lingots without querying the database.
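Here is the same flow as a small Python sketch. Everything in it is a stand-in: the database dict plays the SQL back end, the cache dict plays Redis, and get_lingots and award_lingots are hypothetical helpers.

```python
# Stand-ins: 'database' plays the SQL back end, 'cache' plays Redis.
database = {"Simon": 100}
cache = {}
stats = {"db_reads": 0}  # counts how often we had to go to the back end


def get_lingots(username):
    key = f"user:{username}:lingots"
    if key not in cache:              # cache miss: fall back to the database
        stats["db_reads"] += 1
        cache[key] = database[username]
    return cache[key]


def award_lingots(username, amount):
    new_total = get_lingots(username) + amount
    database[username] = new_total                  # write to the source of truth
    cache[f"user:{username}:lingots"] = new_total   # keep the cache in step
    return new_total


get_lingots("Simon")        # first read misses and loads 100 from the database
award_lingots("Simon", 5)   # reads from the cache, then writes 105 to both
```

After these two calls the database has been read exactly once. Note that the read-then-write in award_lingots is not safe with concurrent writers; with a real Redis you would probably reach for an atomic command such as INCRBY instead.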
The second case is updating dependent information. Let's say you generate chunks of a page and cache their output. The header shows the player's experience, level, and amount of money; the player's Profile page has a block that shows their statistics; and so forth. The player gains some experience. Well, now you have several keys (templates:Header:Simon, templates:StatsBox:Simon, templates:GrowthGraph:Simon, and so forth) where you've cached the output of a half-dozen database queries run through a template engine. Normally, when you display these pages, you say:
$t = GetStringFromRedis("templates:StatsBox:" . $playerName);
if ($t == null) {
    // Cache miss: render the block and store it under the same key we read from.
    $t = BuildTemplate("StatsBox.tmpl",
                       GetStatsFromDatabase($playerName));
    SetStringInRedis("templates:StatsBox:" . $playerName, $t);
}
print $t;
Because you just updated the results of GetStatsFromDatabase("Simon"), you have to drop templates:*:Simon out of your key-value cache. When you try to render any of these templates, your application will churn away fetching data from your database (PostgreSQL, MongoDB) and inserting it into your template; then it will store the result in Redis and, hopefully, not bother making database queries and rendering templates the next time it displays that block of output.
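Dropping templates:*:Simon can be sketched like this in Python. The cache dict is a stand-in for Redis, invalidate is a hypothetical helper, and the Alice entry is made up to show that other players are left alone. Against a real Redis, one option is to walk the keyspace with SCAN and a MATCH pattern and DEL what comes back; another is to keep a per-player set of cached keys so you never have to scan.

```python
from fnmatch import fnmatchcase

# Stand-in for Redis: cached template output keyed by name.
cache = {
    "templates:Header:Simon": "<header>...</header>",
    "templates:StatsBox:Simon": "<div>...</div>",
    "templates:GrowthGraph:Simon": "<svg>...</svg>",
    "templates:Header:Alice": "<header>...</header>",  # made-up second player
}


def invalidate(pattern):
    """Delete every cached key matching a glob pattern; return the removed keys."""
    doomed = sorted(k for k in cache if fnmatchcase(k, pattern))
    for key in doomed:
        del cache[key]
    return doomed


removed = invalidate("templates:*:Simon")
```

The next render of any of Simon's blocks is then a cache miss and gets rebuilt from fresh data.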
Redis also lets you do publish-subscribe message queues and such. That's another topic entirely. The point here is that Redis is a key-value cache, which differs from a relational database or a document store.
Conclusion
Pick your tools based on your needs. The largest need is usually data model, as that determines how complex and error-prone your code is. Specialized applications will lean on performance, places where you write everything in a mixture of C and Assembly; most applications will just handle the generalized case and use a caching system such as Redis or Memcached, which is a lot faster than either a high-performance SQL database or a document store.
And you should use neither if you have plenty of RAM. Redis and MongoDB come at the price of a general-purpose tool, which introduces a lot of overhead.
There was a saying that Redis is 10 times faster than Mongo. That might not be true anymore. MongoDB (if I remember correctly) claimed to beat memcache for storing and caching documents as long as the memory configurations are the same.
Anyhow: Redis is good, MongoDB is good. If you care about substructures and need aggregation, go for MongoDB. If storing keys and values is your main concern, it's all about Redis (or any other key-value store).
Redis and MongoDB are both non-relational databases but they're of different categories.
Redis is a key/value database, and it uses in-memory storage, which makes it super fast. It's a good candidate for caching and temporary (in-memory) data storage, and since most cloud platforms (such as Azure and AWS) support it, its memory usage is scalable. But if you're going to use it on your own machines with limited resources, consider its memory usage.
MongoDB, on the other hand, is a document database. It's a good option for keeping large texts, images, videos, etc. and almost anything you do with databases except transactions. For example, if you want to develop a blog or social network, MongoDB is a proper choice. It scales with a scale-out strategy. It uses disk as its storage medium, so data is persisted.
If your project budget allows you to have enough RAM in your environment, the answer is Redis, especially taking into account the new Redis 3.2 with cluster functionality.