开发者

SQL Azure Sharding and Social Networking Apps

开发者 https://www.devze.com 2023-02-10 03:20 出处:网络
The concept of sharding on SQL azure is one of the top recommended options to get over the 50Gb DB size limit, it has at the moment. A key strategy in sharding is to group related records called atomi

The concept of sharding on SQL azure is one of the top recommended options to get over the 50Gb DB size limit, it has at the moment. A key strategy in sharding is to group related records called atomic units together in a single shard , so that the applica开发者_如何学编程tion needs to only query a single SQL azure instance to retrieve the data.

However in applications such as Social networking Apps, grouping a atomic unit in a single shard is not trivial, due to the inter-connectivity of entities and records. what could be a recommended approach based on such a scenario?

Also in a sharded DB , what primary keys should be used for the tables ? Big Int or GUID. i currently use BIGINT Identity columns but if the data was to be merged for some reason this would be a problem due to conflicts between the values in different shards. i have heard some people recommend GUID's (UniqueIdentifier) but i'm wary on how this could affect performance. Indexing On-premise SQL servers with UniqueIdentifier columns is not possible, and i wonder how SQL azure implements similar strategies if i were to employ a UniqueIdentifier column.


For a social networking app, I'd presonally forgo using SQL and instead leverage a noSQL solution such as MongoDB or Azure Table Storage. These non-normalized but in-expensive systems allow you to create multiple entity datasets that are customized to your various indexing needs.

So instead of having something like... User1 -< relationshiptable -< User2

You'd instead have tables like Users User1's Friends User2's Friends

If Users 1 and 2 are both friends, then you'd have two entries to define that relationship, not one. But if makes retrieving a list of a specific user's friends trivial. It also now opens you up for executing tasks in parallel, by searching multiple index tables at a time.

This process scales extremely well, but does require that you invest more time in how the relationships are maintained. Admittedly, this is a simiplied example. Things get much more complex when you start discussing tasks like searching across your entire user base.

0

精彩评论

暂无评论...
验证码 换一张
取 消