Prune binary data from a git repository after the fact

开发者 https://www.devze.com 2023-02-01 16:03 Source: the web
I accidentally committed some large binary data into some commits. Since then I've updated my .gitignore, and those files are no longer being committed. But I'd like to go back into the older commits and selectively prune out this data from the repository, removing a couple directories that should have been in .gitignore. I don't want to remove the commits themselves.

How would I go about accomplishing this? My preferred method would be some way to retroactively apply the .gitignore rules to old commits. An answer that uses this method would also be generally useful to others, since I'm sure my problem is not unique, and it would be quick to apply as a general solution, without lots of customization specific to each user's directory structure.

Is this possible, either the easy way I suggest above, or in some more complicated manner?


The solution in this answer worked perfectly for me:

You can also test your cleaning process with a tool like BFG Repo-Cleaner, as in this answer:

java -jar bfg.jar --delete-files '*.{jpg,png,mp4,m4v,ogv,webm}' "$bare_repo_dir"

(Note that BFG makes sure it doesn't delete anything in your latest commit, so you first need to remove those files from the current index and make a "clean" commit. BFG will then clean all the previous commits.)
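As a rough sketch of the steps around the BFG run: first stop tracking the files in a normal clone, then, once BFG has rewritten the history, expire the reflog and garbage-collect so the old blobs are actually dropped (the cleanup the BFG documentation recommends). The helper names `untrack_paths` and `expire_and_gc` and the commit message are made up for illustration.

```shell
# Hypothetical helper: remove paths from the index (keeping them on disk)
# and record a "clean" commit.
# usage: untrack_paths <repo-dir> <path>...
untrack_paths() {
    repo=$1
    shift
    git -C "$repo" rm -r -q --cached -- "$@" &&
        git -C "$repo" commit -q -m "Stop tracking large binary files"
}

# Hypothetical helper: after the history rewrite, drop reflog entries and
# prune objects that are no longer reachable.
# usage: expire_and_gc <repo-dir>
expire_and_gc() {
    git -C "$1" reflog expire --expire=now --all &&
        git -C "$1" gc --quiet --prune=now --aggressive
}
```

For example, `untrack_paths . bin` in the working clone before running BFG, then `expire_and_gc "$bare_repo_dir"` on the bare repository afterwards.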


A (relatively) new tool has been released that replaces git filter-branch, which used to be the best answer to this question. git filter-repo is a Python tool that can handle nearly any history rewriting you need to do in git.

For this example (removing specific folders or files from a repo) I could run the command like this:

git filter-repo --path bin --path-glob '*.tar.gz' --invert-paths

This will filter out any content that's in the given folder or matches the given glob pattern. As with any tool that rewrites git history, you should either catch this early, before your commits are shared with others, or be very familiar with git-rebase and with recovering from difficult changes.
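One way to approximate the "retroactive .gitignore" idea from the question, as a sketch rather than a tested recipe: list every path that was ever added in history, let `git check-ignore` pick out the ones the current .gitignore would ignore, and feed that list to filter-repo through its `--paths-from-file` option. The function name `list_ignored_history_paths` is hypothetical. This assumes a non-bare clone with the up-to-date .gitignore checked out, and file names with unusual characters may need extra care.

```shell
# Hypothetical helper: print every path ever added in any commit that the
# current .gitignore rules would ignore. Run from the top of the work tree.
list_ignored_history_paths() {
    # --diff-filter=A keeps only paths at the point they were added;
    # --no-index makes check-ignore report paths even if they are tracked.
    # Note: check-ignore exits non-zero when no path is ignored.
    git -c core.quotePath=false log --all --pretty=format: \
        --name-only --diff-filter=A |
        sed '/^$/d' |
        sort -u |
        git check-ignore --no-index --stdin
}

# Typical use, in a fresh clone (not executed here):
#   list_ignored_history_paths > /tmp/ignored-paths.txt
#   git filter-repo --invert-paths --paths-from-file /tmp/ignored-paths.txt
```

Reviewing the generated list before passing it to filter-repo is a cheap safeguard, since the rewrite itself is not easily undone once pushed.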
