I am attempting to write an update hook for git that bounces if a submodule is being updated to a commit ID that does not exist in the submodule's upstream repository. To say it another way, I want to force users to push changes to the submodule repositories before they push changes to the submodule pointers.
One caveat:
- I only want to test submodules whose bare, upstream repositories exist on the same server as the parent repository. Otherwise we start having to do crazy things like call 'git clone' or 'git fetch' from within a git hook, which would not be fun.
I have been playing around with an idea but it feels like there must be a better way to do this. Here is what I was planning on doing in the update hook:
1. Check the refname passed into the hook to see if we are updating something under `refs/heads/`. If not, exit early.
2. Use `git rev-list` to get a list of revisions being pushed.
3. For each revision:
   1. Call `git show <revision_id>` and use a regular expression that looks to see if a submodule was updated (by searching for `+Subproject commit [0-9a-f]+`).
   2. If this commit did change a submodule, get the contents of the `.gitmodules` file as seen by that particular commit (`git show <revision_id>:.gitmodules`).
   3. Use the results of 3.1 and 3.2 to get a list of submodule URLs and their updated commit IDs.
   4. Check the list created in 3.3 against an external file that maps submodule URLs to local bare git repositories on the filesystem.
   5. `cd` to the paths found in 3.4 and execute `git rev-parse --quiet --verify <updated_submodule_commit_id>` to see if that commit exists in that repository. If it does not, exit with a non-zero status.
(Note: I believe the results of 3.2 can potentially be cached across revisions as long as the output of `git rev-parse --quiet --verify <revision_id>:.gitmodules` doesn't change from one revision to the next. I left this part out to simplify the solution.)
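For what it's worth, steps 1, 2 and 3.1 could look roughly like the sketch below. The function names are made up for illustration, and `pushed_revisions` assumes the branch already exists on the server (it does not cope with an all-zero old revision).

```shell
# Step 1: only refs under refs/heads/ are checked.
is_branch_ref() {
    case "$1" in
        refs/heads/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Step 2: list the revisions being pushed, given the old and new revision.
pushed_revisions() {
    git rev-list "$1..$2"
}

# Step 3.1: read `git show <revision_id>` output on stdin and print the
# commit ID from every "+Subproject commit <id>" line.
new_submodule_commits() {
    sed -n 's/^+Subproject commit \([0-9a-f]\{1,\}\)$/\1/p'
}
```

In the hook itself this would be used as `git show "$rev" | new_submodule_commits` for each revision printed by `pushed_revisions "$OLD" "$NEW"`.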
So yeah, this seems pretty complex, and I can't help but wonder if there are some internal git commands that might make my life a lot easier. Or maybe there is a different way to think about the problem?
Edit, much later: As of Git 1.7.7, `git-push` now has a `--recurse-submodules=check` option, which refuses to push the parent project if any submodule commits haven't been pushed to their remotes. It doesn't appear that a corresponding `push.recurseSubmodules` config parameter has been added yet. This of course doesn't entirely address the problem - a clueless user could still push without the check - but it's quite relevant!
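On the client side, the option can be wrapped in a small shell function so it is harder to forget. This is only a sketch; `push_checked` is a made-up name, not something git provides.

```shell
# Hypothetical wrapper: push the parent project only if every submodule
# commit it references has already been pushed to that submodule's remote.
push_checked() {
    git push --recurse-submodules=check "$@"
}
```

It would be called like `push_checked origin master`, where the remote and branch names are placeholders.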
I think the best approach, rather than examining each individual commit, is to look at the diff across all of the pushed commits: `git diff <old> <new>`. You don't want to look at the whole diff though, really; it could be enormous. Unfortunately, the git-submodule porcelain command doesn't work in bare repos, but you should still be able to quickly examine `.gitmodules` to get a list of paths (and maybe URLs). For each one, you can `git diff <old> <new> -- path`, and if there is a diff, grab the new submodule commit. (And if you're worried about a 000000 old commit possibility, you can just use `git show` on the new one, I believe.)
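A rough sketch of the per-path comparison follows. Note that it swaps `git diff` for `git ls-tree`, which prints the submodule pointer directly as a mode-160000 entry, so there is no diff output to parse. The function names are invented for this example, and an all-zero old revision is treated as "no previous pointer".

```shell
# Print the commit that a submodule path points to in a given revision
# (prints nothing if the path is not a submodule there).
gitlink_at() {
    # ls-tree prints "<mode> <type> <sha><TAB><path>"; gitlinks have mode 160000.
    git ls-tree "$1" -- "$2" | awk '$1 == "160000" { print $3 }'
}

# Print the new submodule commit if the pointer differs between old and new.
changed_gitlink() {
    local old=$1 new=$2 path=$3 before="" after
    # An all-zero old revision means the ref is being created, so there is
    # nothing to compare against; only look it up if it has a non-zero digit.
    case "$old" in
        *[!0]*) before=$(gitlink_at "$old" "$path") ;;
    esac
    after=$(gitlink_at "$new" "$path")
    if [ -n "$after" ] && [ "$after" != "$before" ]; then
        printf '%s\n' "$after"
    fi
}
```

An empty result from `changed_gitlink "$OLD" "$NEW" "$path"` means the pointer for that path did not move, so there is nothing to verify.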
Once you get all that taken care of, you've reduced the problem to checking whether given commits exist in given remote repositories. Unfortunately, as it looks like you've noticed, that's not straightforward, at least as far as I know. Keeping local, up-to-date clones is probably your best bet, and it sounds like you're good there.
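When the submodule's repository is on the same filesystem, the existence check itself can be quite short. The sketch below assumes you already know the path to that repository; `commit_exists` is a made-up name. Keep in mind that `git cat-file -e` only tests whether the object is present in the repository, not whether it is reachable from a branch.

```shell
# Succeed if <sha> names a commit object present in the repository at <repo>.
commit_exists() {
    local repo=$1 sha=$2
    git --git-dir="$repo" cat-file -e "${sha}^{commit}" 2>/dev/null
}
```

Typical use would be `commit_exists /path/to/sub.git "$new_rev" || exit 1`, with the path being whatever your setup maps the submodule to.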
By the way, I don't think the caching is going to be relevant here, since the update hook is run once per ref. Yes, you could do this in a pre-receive hook, which gets all the refs on stdin, but I don't see why you should bother doing more work. It's not going to be an expensive operation, and with an update hook, you can individually accept or reject the various branches being pushed, instead of preventing all of them from being updated because only one was bad.
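To make the difference concrete, here is roughly what the pre-receive variant would look like. It reads one `<old> <new> <ref>` line per pushed ref from stdin, and a non-zero exit rejects the whole push. `check_ref` is a hypothetical function standing in for the submodule test; it is not defined here.

```shell
# pre-receive style: a single bad ref causes every ref in the push to be rejected.
# check_ref is a hypothetical per-ref test taking <ref> <old> <new>.
pre_receive() {
    local old new ref status=0
    while read -r old new ref; do
        if ! check_ref "$ref" "$old" "$new"; then
            echo "rejecting push because of $ref" >&2
            status=1
        fi
    done
    return "$status"
}
```

An update hook would instead call the same per-ref test once with its three arguments and exit with that result, so only the offending branch is refused.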
If you want to save some trouble, I'd probably just avoid parsing the gitmodules file, and hardcode a list into the hook. I doubt your list of submodules changes very often, so it's probably cheaper to maintain that than to write something automated.
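A hardcoded list can be as simple as a `case` statement. The submodule paths and repository locations below are placeholders, not anything from the question.

```shell
# Map a submodule path in the parent project to its bare repository on this
# server. All paths here are made-up examples.
submodule_repo() {
    case "$1" in
        libs/foo) echo /srv/git/foo.git ;;
        libs/bar) echo /srv/git/bar.git ;;
        *) return 1 ;;  # not a submodule we check
    esac
}
```

The hook can then skip any path for which `submodule_repo "$path"` fails, which also covers submodules hosted on other servers.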
Here is my little attempt at a git update hook. Documenting it here so that it could be useful to others. Known caveat is that the '0000...' special case is not handled.
#!/bin/bash
REF=$1
OLD=$2
NEW=$3
# This update hook is based on the following information:
# http://stackoverflow.com/questions/3418674/bash-shell-script-function-to-verify-git-tag-or-commit-exists-and-has-been-pushe
# Get a list of submodules (config key and path) from .gitmodules as of $NEW
git config --file <(git show "$NEW:.gitmodules") --get-regexp '^submodule\..*\.path$' | while read -r key path
do
    # The URL is used directly as GIT_DIR below, so it must be a local path
    url=$(git config --file <(git show "$NEW:.gitmodules") --get "${key%.path}.url")
    # Extract the new submodule commit if the pointer changed between $OLD and $NEW
    git diff "$OLD..$NEW" -- "$path" | grep -e '^+Subproject commit ' |
    cut -f3 -d ' ' | while read -r new_rev
    do
        # Count the branches in the submodule repository that contain the new commit
        LINES=$(GIT_DIR="$url" git branch --quiet --contains "$new_rev" 2>/dev/null | wc -l)
        if [ "$LINES" -eq 0 ]
        then
            echo "Commit $new_rev not found in submodule $path ($url)" >&2
            echo "Please push that submodule first" >&2
            exit 1
        fi
    done || exit 1
done || exit 1
exit 0