
Replication - synchronizing most of the data some of the time

Developer https://www.devze.com 2022-12-28 04:12 Source: Internet

I have some data that isn't properly "partitioned" (for lack of a better word).

All inserts, processing and reporting happen on the same table. The bulk of the processing happens shortly after the insert, and shortly after that the data becomes immutable (we're talking days).

I could do all inserts and processing on a new table that I replicate to the old table. When I detect that the data has become immutable I would delete the data from the new table, but I would edit the delete replication stored procedure so that the delete did not replicate.
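To make the idea concrete, here is a rough simulation rather than real replication. It assumes SQLite triggers as stand-ins for the insert and update replication procedures, and simply omits the delete one. The table names `hot_rows` and `all_rows` and the columns are made up for illustration; an actual SQL Server publication would behave differently in many details.

```python
import sqlite3


def build_demo():
    """Create a hot table whose inserts/updates are copied to a full table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hot_rows (id INTEGER PRIMARY KEY, payload TEXT, status TEXT);
        CREATE TABLE all_rows (id INTEGER PRIMARY KEY, payload TEXT, status TEXT);

        -- Stand-in for the insert replication procedure (assumption).
        CREATE TRIGGER repl_insert AFTER INSERT ON hot_rows
        BEGIN
            INSERT INTO all_rows (id, payload, status)
            VALUES (NEW.id, NEW.payload, NEW.status);
        END;

        -- Stand-in for the update replication procedure (assumption).
        CREATE TRIGGER repl_update AFTER UPDATE ON hot_rows
        BEGIN
            UPDATE all_rows
            SET payload = NEW.payload, status = NEW.status
            WHERE id = NEW.id;
        END;

        -- Deliberately no delete trigger: deletes stay local to hot_rows.
    """)
    return conn


def run_demo():
    """Insert, process, then purge a row; return the contents of both tables."""
    conn = build_demo()
    conn.execute("INSERT INTO hot_rows VALUES (1, 'order A', 'new')")
    conn.execute("INSERT INTO hot_rows VALUES (2, 'order B', 'new')")
    conn.execute("UPDATE hot_rows SET status = 'processed' WHERE id = 1")
    # Row 1 is now considered immutable, so it is purged from the hot table only.
    conn.execute("DELETE FROM hot_rows WHERE id = 1")
    hot = conn.execute("SELECT id, status FROM hot_rows ORDER BY id").fetchall()
    full = conn.execute("SELECT id, status FROM all_rows ORDER BY id").fetchall()
    conn.close()
    return hot, full
```

After `run_demo()`, the hot table holds only the row still being processed, while the full table keeps every row in its last replicated state.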

How bad an idea is this? <edit1>That is, editing the replication stored procedure.</edit1>

It seems attractive at the moment (I haven't slept on it yet) because it might mitigate a performance problem with only very small changes to the application. It also seems like it might be a good way to shoot myself in the foot.

Edit1:

I like the idea of inserting into two tables because I can avoid the view and the maintenance window described in Jono's answer. No offense, Jono, I actually use this technique elsewhere.

I might want to use replication because one table might be in another database (I know, I didn't mention this). That way I don't have to worry about committing to two tables; I just let replication handle that.

My actual concern (that I didn't make clear) is that editing the replication stored procedure could end up being a deployment/maintenance headache.


I wouldn't advocate replication to solve a performance issue (unless it's a problem of physical data distribution); if anything it's going to slow your system down as the changes are propagated to their destination. If you're using a single server, I'd suggest adding a second table with the same schema as the first, but with your indexes optimised for the kind of work you do in your processing phase. Then create a view that selects from both tables, and use that view in any query where you want the union of both tables. You could then throw more hardware at the second table (I'm thinking of a separate file group over more spindles) and then migrate the data on a weekly delay into the first table, during an available maintenance window.
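The two-table layout described above could be sketched roughly as follows. This is only an illustration using SQLite through Python, not SQL Server; the names `active_rows`, `archive_rows`, `all_rows`, `created_day` and `migrate_older_than` are hypothetical, and file groups have no equivalent here.

```python
import sqlite3


def build_db():
    """Create the archive table, the processing table, and a view over both."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE archive_rows (id INTEGER PRIMARY KEY, created_day INTEGER, status TEXT);
        CREATE TABLE active_rows  (id INTEGER PRIMARY KEY, created_day INTEGER, status TEXT);

        -- Index chosen for the processing phase (an assumed access pattern).
        CREATE INDEX ix_active_status ON active_rows (status);

        -- Queries that need everything read from the view instead of a table.
        CREATE VIEW all_rows AS
            SELECT id, created_day, status FROM archive_rows
            UNION ALL
            SELECT id, created_day, status FROM active_rows;
    """)
    return conn


def migrate_older_than(conn, cutoff_day):
    """Move rows created before cutoff_day into the archive; return the count."""
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "INSERT INTO archive_rows (id, created_day, status) "
            "SELECT id, created_day, status FROM active_rows WHERE created_day < ?",
            (cutoff_day,),
        )
        cur = conn.execute(
            "DELETE FROM active_rows WHERE created_day < ?", (cutoff_day,)
        )
    return cur.rowcount
```

Running `migrate_older_than` inside the maintenance window keeps the view's row set unchanged while shrinking the table that the processing phase hits.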
