Why did --cached option on filter-branch remove files from working directory?

Developer https://www.devze.com 2023-03-26 16:43 Source: Internet
I needed to remove some Xcode files that should have been ignored from an old repo, so I ran the following command:

git filter-branch --index-filter 'git rm -f --cached --ignore-unmatch *mode1v3 *pbxuser' HEAD

My understanding was that adding --cached would not affect the current working directory, but git deleted those matching files too. Luckily I had a backup(!) but I'm curious why it does this, or am I misunderstanding what --cached does?


The culprit is not the git rm command. Its --cached option does work as you describe: it removes the files from the index only and leaves the working directory alone. You can easily verify that in a small git repo.
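A minimal sketch of such a check, assuming a throwaway repo created with mktemp; the file name project.pbxuser and the demo identity are made up for illustration:

```shell
set -e

# Throwaway repo so nothing real is touched (path and names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a file that should have been ignored.
echo "settings" > project.pbxuser
git add project.pbxuser
git commit -q -m "Add file that should have been ignored"

# Remove it from the index only.
git rm -q --cached project.pbxuser

# The file is no longer tracked in the index, but it is still on disk.
git ls-files
ls project.pbxuser
```

Here git ls-files prints nothing while ls still finds the file, so git rm --cached on its own leaves the working directory untouched.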

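Although the man page does not mention it, git filter-branch does not preserve your working area: once the rewrite is done, it updates the working tree to match the new HEAD, so files that were tracked before but are absent from the rewritten commit are removed. The command also refuses to run if your working area is not clean, which is already a hint that it will touch it.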
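The refusal on a dirty working area can be seen with a small experiment. This is only a sketch in a scratch repo; notes.txt and the no-op filter 'true' are placeholders, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the delay that newer git versions add before filter-branch starts:

```shell
# Skip the startup warning/delay of newer git versions (ignored by older ones).
export FILTER_BRANCH_SQUELCH_WARNING=1

# Scratch repo with one committed file (names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Leave an unstaged modification in the working area.
echo "a longer, modified version" > notes.txt

# Try a no-op rewrite and record whether filter-branch agreed to run.
if git filter-branch --index-filter 'true' HEAD >/dev/null 2>&1; then
    refused=no
else
    refused=yes
fi
echo "filter-branch refused to run: $refused"
```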

But even if the files are gone from the working area, they are not gone from the repo. They are just no longer in any commit reachable from your current branch. filter-branch stores a reference to your branch as it was before rewriting in the namespace refs/original/.

Use git show-ref to see it.

You could check out the old version to access your removed files. You could also use git cat-file blob refs/original/refs/heads/master:foo to get the contents of a file without checking anything out (use the reference shown by show-ref; foo is the name of the desired file). There are plenty of possibilities.
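Put together, the recovery could look roughly like this. It is a sketch in a scratch repo, not a recipe for your real one: project.pbxuser and main.m are made-up files, the branch name is read from HEAD instead of assuming master, and FILTER_BRANCH_SQUELCH_WARNING only suppresses the startup delay of newer git versions:

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Scratch repo containing one wanted and one unwanted file (names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
echo "settings" > project.pbxuser
echo "code" > main.m
git add .
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

# Same rewrite as in the question; the unwanted file usually vanishes from disk too.
git filter-branch --index-filter \
    'git rm -f --cached --ignore-unmatch *mode1v3 *pbxuser' HEAD >/dev/null 2>&1

# The pre-rewrite branch is kept under refs/original/.
git show-ref

# Read the old blob straight from that backup ref and write it back to disk.
git cat-file blob "refs/original/refs/heads/$branch:project.pbxuser" > project.pbxuser
cat project.pbxuser
```

The restored file ends up untracked, which is what you want if it is going to be listed in .gitignore afterwards.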

You can use gitk --all to navigate through both your original and your rewritten branches, and you will see that nothing is really gone.


The behaviour of git-filter-branch can be surprising, as you've discovered - and it won't protect you from unintended consequences when you run it.

Instead I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative specifically designed for deleting files from Git history. One way in which it makes your life easier here is that it will not delete, or change in any way, files in your latest commit.

You should follow the usage instructions - but the core bit is just this: download the BFG's jar (requires Java 6 or above) and run this command:

$ java -jar bfg.jar --delete-files '*{mode1v3,pbxuser}' my-repo.git

Any file matching that expression in your repository history - which isn't also in your latest commit - will be deleted. You can then use git gc to clean away the dead data:

$ git gc --prune=now --aggressive

The BFG is generally much simpler to use than git-filter-branch - the options are tailored around these two common use-cases:

  • Removing Crazy Big Files
  • Removing Passwords, Credentials & other Private data

Full disclosure: I'm the author of the BFG Repo-Cleaner.
