
How does MongoDB handle a large array field?

开发者 https://www.devze.com 2023-02-17 01:18 (source: web)

I'm trying to store a list of ObjectIds in a document as an array field.

I understand MongoDB has a 4MB size limit for single documents. So, considering that an ObjectId is 12 bytes, a document should be able to hold more than 300,000 entries in one array field. (Let me know if the calculation is off.)
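For what it's worth, the arithmetic can be sanity-checked like this. Note that it is an optimistic upper bound: BSON stores a type byte and a decimal string key (e.g. `"299999"`) for every array element, so the real ceiling is noticeably lower than the raw division suggests.

```javascript
// Optimistic upper bound: ObjectId values alone, ignoring BSON
// per-element overhead (type byte + string key per array entry).
const docSizeLimit = 4 * 1024 * 1024; // 4 MB
const objectIdSize = 12;              // bytes per ObjectId
console.log(Math.floor(docSizeLimit / objectIdSize)); // 349525
```

So "more than 300,000" is right as an upper bound, before accounting for the per-element overhead and the rest of the document.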

If the number of entries in the array gets close to that limit, what kind of performance can I expect? Especially when the field is indexed? Any memory issues?

Typical queries would look like the following:

Query by a single value

db.myCollection.find(
  {
    myObjectIds: ObjectId('47cc67093475061e3d95369d')
  }
);

Query by multiple values

db.myCollection.find(
  {
    myObjectIds: {$in: [ObjectId('47cc67093475061e3d95369d'), ...]}
  }
);

Add a new value to multiple documents

db.myCollection.update(
  {
    _id: {$in: [ObjectId('56cc67093475061e3d95369d'), ...]}
  },
  {
    $addToSet: {myObjectIds: ObjectId('69cc67093475061e3d95369d')}
  },
  {multi: true}  // without this, update() only modifies the first matching document
);


TBH, I think the best thing you can do is benchmark it. Create some dummy data and test the performance as you increase the number of items in the array. It may well be quicker to knock up a test in your environment than to wait for an answer here.

Investigating and blogging about this is on my TODO list, but I haven't got round to it yet. If you do benchmark it, I'd definitely be interested to see your findings; likewise, if I get round to it soon, I'll post my results here too.


With the release of MongoDB 2.4 you can keep arrays capped. On insert, you can tell MongoDB to $sort and $slice the array so it stays at a fixed length according to your criteria (assuming you don't mind throwing data away). For example, you could use this to keep only the most recent N entries of a data log.
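A sketch of that capped-array update, using the question's field name (the cap size is an illustrative assumption, and the actual server call is left commented out since it needs a live connection):

```javascript
const maxEntries = 1000; // illustrative cap

// Keep only the newest maxEntries ids. $slice with a negative count
// keeps the tail of the array, which is the most recent data if ids
// are appended in insertion order. ($sort is also available in 2.4+
// for arrays of subdocuments that need reordering before slicing.)
const cappedPush = {
  $push: {
    myObjectIds: {
      $each: [/* new ObjectId(s) go here */],
      $slice: -maxEntries
    }
  }
};

// Against a live server:
//   db.myCollection.update({_id: someId}, cappedPush);

console.log(cappedPush.$push.myObjectIds.$slice); // -1000
```

Note that $slice (in 2.4) requires $each, even when pushing a single value.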


You won't notice when you hit the document size limit unless you call getLastError after each update: the update simply fails, and a message is written to the database log. I also have anecdotal evidence from my local ops guy that Mongo seems to work harder when many updates fail because the document size limit has been reached.
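In the pre-2.6 shell, surfacing that silent failure looks like this (field names taken from the question; the server calls are commented out since they need a live connection):

```javascript
// After each update, ask the server whether the last write failed:
//
//   db.myCollection.update({_id: id}, {$push: {myObjectIds: newId}});
//   var err = db.getLastError();  // null on success, an error string on failure
//   if (err !== null) print("update failed: " + err);
//
// Under the hood, db.getLastError() just runs this command document:
const cmd = {getLastError: 1};
console.log(JSON.stringify(cmd)); // {"getLastError":1}
```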

I know of no easy way of avoiding it other than designing around it. As far as I know there is no way to conditionally push to a list. I've seen other questions here on SO where people tried to build fixed-size lists and such, but no good solutions were found.

