Is there a clustered database which allows grouping the data along two dimensions?

Source: https://www.devze.com, 2023-03-16 07:55 (reposted from the web)
Say I have entities to store. For now it is good enough to consider them blobs. I want the entities to be stored on a cluster. The key/ID of an entity is an (x,y) integer coordinate, so the entities are basically located in a two-dimensional grid. Updating any entity requires locking its 4 neighbors. Since I want redundancy anyway, I thought it would be best to use that redundancy to ensure that the neighbors are always available. Here is what the distribution could look like:

   1  2  3  4  5  6
1 [F][F][E][E][G][G]
2 [F][F][E][E][G][G]
3 [D][D][A][A][B][B]
4 [D][D][A][A][B][B]
5 [H][H][C][C][I][I]
6 [H][H][C][C][I][I]

If A, B, C, D, E, F, G, H, I are servers, then A owns the (3,3) entity, and it needs to know (2,3) and (3,2), which belong to other servers. With entities arranged in blocks of 4, every entity always has two sides bordering other servers. Using triple redundancy, I want to force a local copy of all neighbors. This would in effect give me linear scalability.
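To make the layout above concrete, here is a minimal Python sketch. It assumes 1-based coordinates, an unbounded grid, and 2x2 blocks identified by a (block_x, block_y) index rather than by server letter; the function names are hypothetical. It computes which block owns an entity and which other blocks own at least one of its 4 neighbors, i.e. where copies of that entity would have to live.

```python
BLOCK = 2  # assumed block edge length, matching the 2x2 blocks in the grid


def block_of(x, y, block=BLOCK):
    """Return the (block_x, block_y) index of the block owning 1-based cell (x, y)."""
    return ((x - 1) // block, (y - 1) // block)


def neighbors(x, y):
    """Return the 4 cells that must be locked when (x, y) is updated."""
    return [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]


def replica_blocks(x, y, block=BLOCK):
    """Return (owner, others): the owning block and the other blocks
    that own at least one neighbor of (x, y)."""
    owner = block_of(x, y, block)
    others = {block_of(nx, ny, block) for nx, ny in neighbors(x, y)} - {owner}
    return owner, sorted(others)


if __name__ == "__main__":
    print(replica_blocks(3, 3))
```

With 2x2 blocks every cell sits in a corner of its block, so `replica_blocks` always reports the owner plus exactly two other blocks, which is the triple redundancy described above.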

Is there a database which allows me to define the sharding/replication key such that I can specify such a distribution, or is there a way of combining x and y into a single value that could be used to achieve this?
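One possible way of combining x and y into a single value is a Z-order (Morton) code over the block indices, sketched below. This is only an illustration of the idea, not something the question or any particular database prescribes: it assumes non-negative block indices that fit in 16 bits, 2x2 blocks, and hypothetical function names. All four cells of a block get the same key, and blocks that are close in the grid tend to get numerically close keys, but the key alone does not place copies on neighboring servers.

```python
def interleave_bits(a, b, bits=16):
    """Interleave the low `bits` bits of a and b into one integer.
    Bits of a land on even positions, bits of b on odd positions."""
    key = 0
    for i in range(bits):
        key |= ((a >> i) & 1) << (2 * i)
        key |= ((b >> i) & 1) << (2 * i + 1)
    return key


def shard_key(x, y, block=2):
    """Map a 1-based (x, y) cell to a single integer key shared by its whole block."""
    bx, by = (x - 1) // block, (y - 1) // block
    return interleave_bits(bx, by)


if __name__ == "__main__":
    for cell in [(1, 1), (3, 3), (4, 4), (5, 3)]:
        print(cell, shard_key(*cell))
```

A range-sharded store could use such a key to keep whole blocks together; whether it can also be told to replicate each block to its neighbors' servers is exactly the open part of the question.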

What I'm after is low latency and redundancy, not saving drive space. My entities have a "locality of reference" property: transactions only ever access the neighbors. But using the same key for an entity and its neighbors would result in every entity having the same key.


I understand that duplicating BLOB data can be expensive in terms of storage, so we want to limit redundancy. The benefit of sharding the data in a denormalized way is that it can greatly speed up searching and overall performance.

This is just a thought, but perhaps you could approach this in a normalized way instead by creating three tables:

Coordinates - Column (ID), Column (Xcor), Column (Ycor)

Data - Column (ID), Column (Checksum?)

CoordinateData - Column (CoordinateID), Column (DataID)

CoordinateData serves as a mapping table. This normally isn't ideal for indexing or searching; however, if you stored a checksum string, you could use some other medium for storing and locating the raw data.
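The three tables could be sketched roughly as follows, using Python's built-in sqlite3 module purely for illustration. The column types, the SHA-256 checksum, the one-data-row-per-coordinate constraint, and the in-memory `blob_store` dictionary standing in for "some other medium" are all assumptions, not part of the suggestion itself.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Coordinates (
    ID   INTEGER PRIMARY KEY,
    Xcor INTEGER NOT NULL,
    Ycor INTEGER NOT NULL,
    UNIQUE (Xcor, Ycor)
);
CREATE TABLE Data (
    ID       INTEGER PRIMARY KEY,
    Checksum TEXT NOT NULL UNIQUE
);
-- Assumption: each coordinate maps to exactly one Data row.
CREATE TABLE CoordinateData (
    CoordinateID INTEGER PRIMARY KEY REFERENCES Coordinates(ID),
    DataID       INTEGER NOT NULL REFERENCES Data(ID)
);
""")

blob_store = {}  # hypothetical stand-in for external blob storage, keyed by checksum


def put_entity(x, y, blob):
    """Store the raw blob outside the database and record its checksum for (x, y)."""
    checksum = hashlib.sha256(blob).hexdigest()
    blob_store[checksum] = blob
    conn.execute("INSERT OR IGNORE INTO Coordinates (Xcor, Ycor) VALUES (?, ?)", (x, y))
    conn.execute("INSERT OR IGNORE INTO Data (Checksum) VALUES (?)", (checksum,))
    conn.execute(
        """INSERT OR REPLACE INTO CoordinateData (CoordinateID, DataID)
           SELECT c.ID, d.ID FROM Coordinates c, Data d
           WHERE c.Xcor = ? AND c.Ycor = ? AND d.Checksum = ?""",
        (x, y, checksum),
    )


def get_entity(x, y):
    """Look up the checksum for (x, y) and fetch the blob from the external store."""
    row = conn.execute(
        """SELECT d.Checksum FROM Coordinates c
           JOIN CoordinateData cd ON cd.CoordinateID = c.ID
           JOIN Data d ON d.ID = cd.DataID
           WHERE c.Xcor = ? AND c.Ycor = ?""",
        (x, y),
    ).fetchone()
    return blob_store[row[0]] if row else None
```

Calling `put_entity(3, 3, b"payload")` and then `get_entity(3, 3)` returns the stored bytes; identical blobs at different coordinates share one Data row and one stored copy.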

Like I said, just an idea.
