
Is it viable to handle MySQL backups with git?

Today I had this really neat idea for backing up my database: put the dump file in a git repository, then commit on each dump, so that I have the most recent copy and can also easily roll back to any previous backup. I can pull the repository on a regular basis to keep a copy on my own computer as a backup of the backups. It definitely sounds clever.

However, I'm aware that clever solutions sometimes have fundamental flaws. What sort of issues might I hit storing mysqldump diffs in git? Is it worth it? What do most people do in order to have multiple database backups on the server and keep redundant copies elsewhere?
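For concreteness, the loop I have in mind looks roughly like this (repository path and database name are placeholders):

    #!/bin/sh
    # Sketch of the proposed backup loop; run from cron or similar.
    cd /var/backups/db-repo || exit 1
    mysqldump --single-transaction mydb > dump.sql
    git add dump.sql
    git commit -m "backup $(date -u +%Y-%m-%dT%H:%M:%SZ)"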


Normally you don't keep every backup (or snapshot) forever, but a git repository keeps every commit you ever make. If you ever decide to prune old revisions (say, thinning month-old revisions to one per week and year-old ones to one per month) you will have to do it with git filter-branch, which rewrites the entire history, and then run git gc to actually discard the unwanted objects.
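To make that concrete, here is a rough sketch of what dropping everything older than some cutoff commit involves (<cutoff> is a placeholder ref, and this is an outline, not a tested recipe):

    # Pretend <cutoff> has no parents, then bake that into the real history.
    # Every rewritten commit gets a new hash, so any clones must re-fetch.
    git replace --graft <cutoff>
    git filter-branch -- --all
    # Space is only reclaimed once the reflogs no longer pin the old objects.
    git reflog expire --expire=now --all
    git gc --prune=now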

Given that git's strengths are distributed version control and complex patch/branch workflows (neither of which applies to snapshots or backups), I'd consider using a different VCS with a more malleable history.


This approach sounds fine to me. I use Git for backing up my own important data.

Note that you are not storing diffs -- Git's object model stores a complete snapshot of the directory state with each commit; the diff between two commits is computed on demand, not stored. (Pack files do delta-compress objects internally, but that is a transparent storage optimization, unrelated to the commit model.)
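You can see this for yourself with git's plumbing commands (assuming the dump file is called dump.sql):

    git cat-file -p HEAD            # the commit: a tree, a parent, a message
    git cat-file -p 'HEAD^{tree}'   # the tree: complete blob entries, no deltas
    git cat-file -s HEAD:dump.sql   # byte size of the full file at this commit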


In theory this will work, but you will start to have problems when the database dumps get large.

Git doesn't have any hard file size limits, but it will diff the contents of your latest dump against the one previously stored in the repository, which requires at least as much memory as both files combined - so I would imagine it will get very slow, very quickly with files over 100MB (or even 10MB).

Git wasn't made for dealing with files of this type (i.e. big data files instead of source code), so I think this is fundamentally a bad idea. You could, however, use something like Dropbox to store the dumps - which will still save the version history for you, but is more tailored towards files which can't effectively be diffed.


If you're using MySQL (and possibly other databases) with binary logging enabled, you might consider putting your binlog directory under a git repo and committing updates to it on a regular schedule.

In MySQL, the binlog records every statement that changes data in any table of the database. If you sync your commits with regular full dumps of the database, you have a versioned way to restore data.
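A rough sketch of that strategy (the paths and my.cnf setting here are assumptions; adjust for your setup):

    # In my.cnf, binary logging is enabled with something like:
    #   [mysqld]
    #   log-bin = /var/lib/mysql/binlog/mysql-bin
    # Then, on a schedule: close the current binlog and commit the closed ones.
    mysql -e 'FLUSH BINARY LOGS'
    cd /var/lib/mysql/binlog || exit 1
    git add .
    git commit -m "binlog snapshot $(date -u +%F)"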

Honestly, I think just using MySQL's native tools would probably be a better solution, but what I've outlined here gets you versioned MySQL data, which I think is what you were after in the first place.
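For reference, the native route is a full dump plus binlog replay for point-in-time recovery; in outline (file names and the timestamp are placeholders):

    # Restore the most recent full dump...
    mysql mydb < dump.sql
    # ...then replay everything the binlog recorded after that dump was taken.
    mysqlbinlog --start-datetime="2023-01-25 00:00:00" \
        mysql-bin.000042 mysql-bin.000043 | mysql mydb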
