Mongo sharding fails to split large collection between shards

Developer, https://www.devze.com, 2023-01-16 02:43, source: web
I'm having problems with what seems to be a simple sharding setup in mongo.

I have two shards, a single mongos instance, and a single config server set up like this:

Machine A - 10.0.44.16 - config server, mongos

Machine B - 10.0.44.10 - shard 1

Machine C - 10.0.44.11 - shard 2

I have a collection called 'Seeds' with the shard key 'SeedType', a field that is present on every document in the collection and contains one of four values (see the sharding status below). Two of the values have significantly more entries than the other two: two have 784,000 records each, and two have about 5,000.

The behavior I'm expecting to see is that records in the 'Seeds' collection with InventoryPOS will end up on one shard, and the ones with InventoryOnHand will end up on the other.

However, it seems that all records for both of the larger shard key values end up on the primary shard.

Here's my sharding status text (other collections removed for clarity):

--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "host" : "10.44.0.11:27019" }
      { "_id" : "shard0001", "host" : "10.44.0.10:27017" }
  databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "TimMulti", "partitioned" : true, "primary" : "shard0001" }
                TimMulti.Seeds chunks:
                        { "SeedType" : { $minKey : 1 } } -->> { "SeedType" : "PBI.AnalyticsServer.KPI" } on : shard0000 { "t" : 2000, "i" : 0 }
                        { "SeedType" : "PBI.AnalyticsServer.KPI" } -->> { "SeedType" : "PBI.Retail.InventoryOnHand" } on : shard0001 { "t" : 2000, "i" : 7 }
                        { "SeedType" : "PBI.Retail.InventoryOnHand" } -->> { "SeedType" : "PBI.Retail.InventoryPOS" } on : shard0001 { "t" : 2000, "i" : 8 }
                        { "SeedType" : "PBI.Retail.InventoryPOS" } -->> { "SeedType" : "PBI.Retail.SKU" } on : shard0001 { "t" : 2000, "i" : 9 }
                        { "SeedType" : "PBI.Retail.SKU" } -->> { "SeedType" : { $maxKey : 1 } } on : shard0001 { "t" : 2000, "i" : 10 }

Am I doing anything wrong?

Semi-unrelated question:

What is the best way to atomically transfer an object from one collection to another without blocking the entire mongo service?

Thanks in advance, -Tim


Sharding really isn't meant to be used this way. You should choose a shard key with some variation (or make a compound shard key) so that MongoDB can make reasonable-size chunks. One of the points of sharding is that your application doesn't have to know where your data is.
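
To see why a four-value shard key limits what MongoDB can do, here is a rough Python sketch. It is a toy model, not MongoDB's actual split algorithm: it only encodes the rule that a chunk boundary has to fall between two different shard key values. The counts are the ones from the question scaled down 1:1000, the chunk limit of 100 is made up, and the second key component is a hypothetical per-document value (something like _id).

```python
def split_into_chunks(doc_counts, max_docs_per_chunk):
    """Greedy range chunking over sorted shard key values.

    A boundary can only fall between two different key values, so all
    documents sharing one key value always stay in the same chunk.
    """
    chunks, current, size = [], [], 0
    for key, n in sorted(doc_counts.items()):
        # Start a new chunk if this key value would overflow the current one.
        if current and size + n > max_docs_per_chunk:
            chunks.append((current, size))
            current, size = [], 0
        current.append(key)
        size += n
    if current:
        chunks.append((current, size))
    return chunks


# Document counts per SeedType, scaled down 1:1000 from the question.
simple_key = {
    "PBI.AnalyticsServer.KPI": 5,
    "PBI.Retail.InventoryOnHand": 784,
    "PBI.Retail.InventoryPOS": 784,
    "PBI.Retail.SKU": 5,
}

# Hypothetical compound key (SeedType, n): n stands in for a field that
# differs per document, so every key value covers exactly one document.
compound_key = {(seed_type, n): 1
                for seed_type, count in simple_key.items()
                for n in range(count)}

simple_chunks = split_into_chunks(simple_key, max_docs_per_chunk=100)
compound_chunks = split_into_chunks(compound_key, max_docs_per_chunk=100)

print(len(simple_chunks), max(size for _, size in simple_chunks))      # 4 784
print(len(compound_chunks), max(size for _, size in compound_chunks))  # 16 100
```

With SeedType alone, the two big values each end up as one oversized chunk that cannot be cut any further. With the compound key there are plenty of split points, so chunks stay at a size the balancer can move around.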

If you want to manually shard, you should do that: start unlinked MongoDB servers and route things yourself from the client side.

Finally, if you're really dedicated to this setup, you could migrate the chunk yourself (there's a moveChunk command).
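
For the chunk list in the question, a manual move could look roughly like the sketch below. The namespace, shard key value, and shard name are taken from the sharding status output; the helper function name is made up for illustration, and the connection string in the comment is a placeholder, not a real address.

```python
def build_move_chunk_command(namespace, seed_type, to_shard):
    """Build the admin command document that asks mongos to move the chunk
    containing the given SeedType value to another shard."""
    return {
        "moveChunk": namespace,           # first key: it names the command
        "find": {"SeedType": seed_type},  # any shard key value inside the chunk
        "to": to_shard,                   # shard _id as listed in the status output
    }


# Move the InventoryPOS chunk off shard0001, leaving InventoryOnHand behind.
cmd = build_move_chunk_command(
    "TimMulti.Seeds", "PBI.Retail.InventoryPOS", "shard0000")
print(cmd)

# Sending it is not done here. Assuming pymongo is installed, it would be
# issued against the admin database through mongos, along the lines of:
#   from pymongo import MongoClient
#   MongoClient("mongodb://<mongos-host>:<port>").admin.command(cmd)
```

Chunk ranges include their lower bound, so a find on "PBI.Retail.InventoryPOS" picks the chunk that runs from InventoryPOS up to SKU.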

The balancer moves chunks based on how much is mapped in memory (run serverStatus and look at the "mapped" field). It can take a while: MongoDB doesn't want your data flying all over the place in production, so it's pretty conservative.

Semi-unrelated answer: you can't do it atomically with sharding (eval isn't atomic across multiple servers). You'll have to do a findOne, insert, remove.
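
The findOne, insert, remove sequence could be sketched like this in Python. The function name is made up, and it assumes collection objects with pymongo-style find_one, insert_one, and delete_one methods. It is not atomic, it only orders the writes so that a failure halfway leaves a duplicate rather than a lost document.

```python
def move_document(source, target, query):
    """Copy one matching document into `target`, then delete it from `source`.

    Not atomic: if the process dies between the two writes, the document
    exists in both collections until the move is retried or cleaned up.
    Returns the moved document's _id, or None if nothing matched.
    """
    doc = source.find_one(query)
    if doc is None:
        return None
    target.insert_one(doc)                  # write the copy first, so nothing is lost
    source.delete_one({"_id": doc["_id"]})  # then remove the original by its _id
    return doc["_id"]
```

Because the copy keeps the same _id, a retry after a half-finished move will hit a duplicate key on the insert, so the retry logic has to tolerate the document already being in the target.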
