How to design a SQL db with undo-redo?_问答_开发者

I'm trying to figure out how to design my DB tables to allow Undo-Redo.

Pretend you have a tasks table with the following structure:

id <int>
title <varchar>
memo <string>
date_added <datetime>
date_due <datetime>

Now assume that over a few days and multiple log-ins that several edits have taken place; but a user wants to go back to one of the versions.

Would you have a separate table tracking the changes - or - would you try to keep the changes within the tasks table ("ghost" rows, for lack of a better term)?
Would you track all of the columns or just the ones that changed each time?

If it matters, I'm using MySQL. Also, if it matters, I'd like to be able to show the history (ala Photoshop) and allow a user to switch to any version.

Bonus q开发者_开发技巧uestion: Would you save the whole memo cell on a change or would you try to save the delta only? Reason I ask is because the memo cell could be large and only a single word or character might be changed each revision. Granted, saving the delta would require parsing, but if undos aren't expected very often, wouldn't it be better to save space rather than processing time?

Thank you for your help.

I would create a History table for your tasks table. Same structure as tasks + a new field named previousId. This would hold the previous change id, so you can go back an forth through different changes (undo/redo).

Why a new History table? For a simple reason: do not overload tasks table with things that it was not designed for.

As for space, in the History, instead of a Memo, use a binary format and zip the content of the text you want to store. Don't try to detect changes. You will run into a buggy code which will result in frustration and wasted time...

Optimization: Even better, you may keep only three columns in History table: 1. taskId (foreign key to tasks) 2. data - a binary field. Before saving in the History table, create an XML string holding only the fields that have changed. 3. previousId (will help maintain a queue of changes and allow navigation back and forth)

As for data field, create an XML string like this:

<task>
  <title>Title was changed</title>
  <date_added>2011-03-26 01:29:22<date_added>
</task>

This will basically tell you that this time you changed only the title and the date_added fields.

After the XML string is built, just zip it if you want and store it into History table's data field.

XML will also allow for flexibility. If you add / remove a field in tasks table, you don't need to update the History table, too. So this way the structure of the tasks table and History table are decoupled so you don't need to update two tables each time.

PS: don't forget to add some indexes to quickly navigate through the history table. Fields to be indexed: taskId and previousId as you will need fast queries against this table.

Hope this helps.

When I do similar types of things using SQL I always use a second table for revision history. This prevents your primary table from getting overly large with versions. The rationale is that retrieving the record that is current happens almost 100% of the time, viewing history and rolling back (undo) is very infrequent.

If you only have a single UNDO or history, then tracking in-table is probably fine.

Whether you want to save deltas or the entire cell depends on expected growth / usage. If you are comfortable creating the logic to manage deltas, that will save you space. If things don't really create new versions that often I wouldn't start with that, (applying YAGNI)

You might want to compress revisions in delta form but you should still have the current revision in full for quick retrieval.

However, older to newer deltas require lots of processing unless you have some non-delta to base on. Newer to older deltas require reprocessing every time something changes. So deltas usually do not get you many benefits but greater complexity.

Last I checked, which is some years ago, MediaWiki, the software behind Wikipedia, stored full texts and provided some means to compress older revisions with gzip to save space and a dedicated table archive for deleted revisions / pages.

Their website has an ER diagram of their database layout which you might find useful.