Efficiently backup many versions of a git repo with branch namespacing

Developer https://www.devze.com 2023-01-07 05:19 Source: Internet
At work, we use Perforce for version control. There are problems with this: 1) With this centralized model, we can't check in changes until they are ready for regression. This means that we have no revision control during the development process. 2) We don't back up our client view of the depot, so our work is unsafe until we can check it in. 3) We have problems sharing our code with each other unless we beg to have an integration branch set up. I am trying to set up an optional git workflow for developers who want to use git to beat these problems.

The plan is to use git-p4 to interface with the Perforce server and create a private git repo. This takes care of 1). I plan to use the integration-manager workflow depicted in Pro Git (http://progit.org/book/ch5-1.html) to have our developers publish public repos, taking care of 3).

Finally, I want a place where developers can push their changes so that they will be pulled into nightly backups / offsite backups. The reason we don't back up our client views now is that doing nightly archival backups of everyone's client view is space inefficient. We have a lot of developers, and they produce a lot of code. We can't be redundantly backing up everyone's client view. We want to preserve only the unique changes that they are making.

My thinking was to have one bare git repo, call it omni-backup, that everyone can push all of their branches to (and feel free to suggest alternatives). This would take advantage of git's space-efficient SHA-1 hashing, where identical file contents are stored as a single object, and ensure that only unique versions of each file are backed up. The trick is that all the backups have to live in the same repo to get the space efficiency.
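As a small illustration of why a single shared repo saves space: git names a blob by a hash of its contents, so two files with identical contents get the same object ID and are stored only once in a given repo. The file names below are hypothetical and the sketch runs in a throwaway temp directory.

```shell
#!/bin/sh
set -e
# Hypothetical sandbox directory with two files holding the same contents.
work=$(mktemp -d)
printf 'same content\n' > "$work/a.txt"
printf 'same content\n' > "$work/b.txt"

# hash-object prints the object ID git would use for each file's contents.
a=$(git hash-object "$work/a.txt")
b=$(git hash-object "$work/b.txt")
echo "$a"
echo "$b"
```

Both lines printed should be the same ID, which is what lets one repo hold many developers' copies of an unchanged file as a single object.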

The problem arises when two people with completely different branches choose the same name for their branch. E.g., Bob has a feature branch and Jane has a feature branch, but they're for different features. If Bob pushes to omni-backup, Jane won't be able to, as her push wouldn't be a fast-forward update.

Ideally, when Bob pushes his feature branch, the branch would be renamed to bob-feature on the omni-backup remote. And when he pulls feature from omni-backup, he would get bob-feature back.

This doesn't seem terribly easy to accomplish in git. It looks like I can use the post-receive hook documented in http://www.kernel.org/pub/software/scm/git/docs/git-receive-pack.html to rewrite the name of the ref immediately after it is written, and then something could be done to reverse the process on the way back, but it feels fragile. Anyone have a better idea?


Edit, for VonC (because code sucks in comments): Your way sounds promising, VonC, but I don't see how the fact that it's a fetch gets around the namespacing problem. Are you suggesting a cron job that knows how to rename the branch?

Something like this (really dirty):

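# Assumes @users, $LDAPSERVER and $REPO are defined elsewhere.
foreach my $user (@users) {
    my $url = "$LDAPSERVER/$user/$REPO";
    # Collect the branch names this developer's repository advertises.
    my @branches = map { m{refs/heads/(\S+)} ? $1 : () } `git ls-remote --heads $url`;
    foreach my $branch (@branches) {
        # Force-fetch each branch into a ref prefixed with the user's name.
        system "git fetch $url +refs/heads/${branch}:refs/heads/${user}-${branch}";
    }
}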


If you can get the developers to follow certain guidelines, git push can do it correctly.

If you run this command:

git push omni-backup feature:bob-feature

where omni-backup is the name of the remote, then Bob's feature branch will be pushed to bob-feature on omni-backup. But if entrusting that to developers is undesirable, reversing the direction of flow and having omni-backup pull from the developers' repositories, as VonC suggests, is the better solution.
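If typing the refspec on every push is the sticking point, one option is to store it in each developer's remote configuration. The following is a minimal sketch, not a tested deployment: it uses a throwaway bare repo as a stand-in for omni-backup, and it assumes a bob/ prefix instead of the bob- prefix above, because a wildcard refspec maps most naturally onto a path-style prefix.

```shell
#!/bin/sh
set -e
# Hypothetical sandbox: a throwaway bare repo stands in for omni-backup.
work=$(mktemp -d)
git init -q --bare "$work/omni-backup.git"

# Bob's repo with a single branch named "feature".
git init -q "$work/bob"
cd "$work/bob"
git -c user.name=Bob -c user.email=bob@example.com \
    commit -q --allow-empty -m "start feature"
git branch -M feature

git remote add omni-backup "$work/omni-backup.git"
# Push every local branch to bob/<branch> on the backup remote ...
git config remote.omni-backup.push '+refs/heads/*:refs/heads/bob/*'
# ... and map only Bob's branches back when fetching.
git config remote.omni-backup.fetch '+refs/heads/bob/*:refs/remotes/omni-backup/*'

git push -q omni-backup      # no refspec needed on the command line
git fetch -q omni-backup

# Show the branch names as the backup repo sees them.
git -C "$work/omni-backup.git" for-each-ref --format='%(refname:short)' refs/heads
```

With this configuration Bob keeps calling the branch feature locally, while the backup repo stores it under his prefix. It still depends on each developer setting up the configuration correctly, so the caveat about trusting developers applies.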


Why would you need developers to push to the omni-backup repo?

For backup purposes, I would rather register each developer's repository as a remote, and do a git fetch every night (from the omni-backup server) on all the remote repos.
That way, no branch name collision is possible, and the process is more automated (the developer doesn't have to explicitly push anything to a repo he/she doesn't work with directly and only considers a backup).

Then I would produce a nice little git archive out of omni-backup and store it away.
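The nightly fetch described above could be sketched roughly as follows. The developer names, paths, and temp-directory layout are hypothetical stand-ins; on a real server the remotes would point at each developer's published repo and the fetch would run from something like cron.

```shell
#!/bin/sh
set -e
# Hypothetical sandbox: two developer repos that both have a "feature" branch.
work=$(mktemp -d)
for dev in bob jane; do
    git init -q "$work/$dev"
    git -C "$work/$dev" -c user.name="$dev" -c user.email="$dev@example.com" \
        commit -q --allow-empty -m "$dev: feature work"
    git -C "$work/$dev" branch -M feature
done

# The backup repo registers each developer's repo as a remote ...
git init -q --bare "$work/omni-backup.git"
cd "$work/omni-backup.git"
for dev in bob jane; do
    git remote add "$dev" "$work/$dev"
done

# ... and the nightly job fetches all of them in one go.
git fetch -q --all

# Each developer's branches land under their own remote name.
git for-each-ref --format='%(refname)' 'refs/remotes/*/feature'
```

Because git remote add sets up a fetch refspec of the form refs/heads/*:refs/remotes/<name>/*, the two feature branches end up as refs/remotes/bob/feature and refs/remotes/jane/feature and never collide, without any renaming logic.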
