开发者

What situations require me to store different versions of the same data in a database?

开发者 https://www.devze.com 2023-01-11 16:49 出处:网络
This is a shot from Google BigTable paper What can be the kind of scenarios in which instead of ha开发者_如何学运维ving something like Oracle\'s redo logs,I will need to store multiple versions of

This is a shot from Google BigTable paper

What situations require me to store different versions of the same data in a database?

What can be the kind of scenarios in which instead of ha开发者_如何学运维ving something like Oracle's redo logs, I will need to store multiple versions of the same data within the database?

Coming specifically to this example, why do I need to store multiple versions of a html page in my database? It cannot act as a backup because anyways all the versions are not there, only some of them are (say last 5).


One thing to recognize with BigTable and similar non-relational stores, is that they have a totally different consistency model.

Once you introduce the concept of distributing your data across multiple nodes, you run the risk of consistency errors. Distributed databases are expected to be able to recover from inconsistencies results from nodes being down without shutting down the database or doing what would be considered a 'restore'.

Say you have a record stored in nodes 'A' and 'B'. In 'multi-master' replication, you don't have the concept of primary and copy. Rather it is possible for the record to be updated in both nodes simultaneously (especially if communication between the two is broken). Versioning can help resolve the resulting consistency issues.

Also, these databases tend not to do 'deletes'. You simply store a newer version that is marked as deleted (or expired, or whatever). Similarly a 'rollback' would be creating a newer version of a record from an earlier one.


Cases:

You want to know how daa has changed in the past. Examples: Tracking order status as it goes through the process. Track customer addresses even when they moved.

This can be a business requirement, or it can be - actually - a legal requirement. Quite often both.


Coming specifically to this example, why do I need to store multiple versions of a html page in my database?

In case you want to revert to a previous version.


Also useful for tracking changes for auditing/change logs (even if you can't revert, you can at least see who changed what when).


You are aware that (in addition to redo logs) Oracle also stores previous versions of the same data (in the undo tablespace)? This is called multi-version concurrency control and allows lock-free selects (you can select the previous value of a row that is being changed by an ongoing transaction, without having to wait for the new data to commit).


What situations : Retrieving a view of data 'as was' - this can be very useful for diagnosis (i.e. being able to re-run a process using the same data, without restoring the whole database). See Oracle's Flashback Query for a way to do this over short timescales.

We have a situation where business rules are soft-coded on site by the customer, and stored in the database. They may change at any moment, yet are used to calculate stored data. Versioning the configuration gives us a way to 'rollback' the configuration and understand how data was derived.

(I can't recall the specific term for Oracle's built-in row versioning, where it effectively stores a history table for each table).

Yes, versioning means a lot more storage, but I would say that in the places where it is useful, data is rarely volatile.

0

精彩评论

暂无评论...
验证码 换一张
取 消