
git am/format-patch: control format of line endings

Developer https://www.devze.com 2023-03-11 20:32 Source: Internet

I created a patch from three commits using

git format-patch <revision_three_commits_ago>

This creates three patch files that I mailed from my notebook and read the mail on my desktop computer (both are Windows boxes).

When I do now

git am --3way --ignore-space-change *.patch

the patches apply, but I don't get the same SHA1 IDs for the commits. Searching a bit in the patched files, I found that the modified lines on my desktop computer end with LF, whereas the modified lines on the notebook (where I created the patch) end with CR LF.

So, my first thought was to call git am without --ignore-space-change, but this gives me an error (patch does not apply).

How could I tell git format-patch or git am about how to handle the line endings (msysgit 1.7.4)?

Do I really have to take VIM and change the file format from UNIX to DOS before I can apply the patches?


EDIT: Not even modifying the patch files with VIM helps: I thought, set ff=dos and a :%s/^M//g would help, but it doesn't!

In my opinion, applying a patch should result in exactly the same content, and also the same commit hash, as if I had pulled from the other repo where the patch was created. Am I thinking wrong about that?


After playing around with various options (core.autocrlf, core.eol) I found that using

git am --keep-cr

does the trick (but causes a warning about trailing whitespaces).

No manual editing of the patch file or other dirty tricks are necessary.
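As a rough illustration of the difference (not taken from the original post), the sketch below builds two throwaway repositories under a temporary directory. The names notebook, desktop and file.txt and the Demo identity are made up for the demo. It tries a plain git am first and then git am --keep-cr, and compares the resulting trees:

```shell
#!/bin/sh
# Demo only: repository names, file names and identities are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# "notebook": a base commit and one change, stored with CRLF line endings.
git init -q "$work/notebook"
cd "$work/notebook"
git config core.autocrlf false
printf 'one\r\ntwo\r\n' > file.txt
git add file.txt
git commit -q -m 'base'
printf 'one\r\nTWO\r\n' > file.txt
git commit -q -a -m 'change line two'
git format-patch -1 -o "$work/patches" >/dev/null

# "desktop": a clone rewound to the base commit.
git clone -q "$work/notebook" "$work/desktop"
cd "$work/desktop"
git config core.autocrlf false
git reset -q --hard HEAD~1
base=$(git rev-parse HEAD)

# Plain git am drops the CR at the end of each patch line, so the hunk is
# not expected to match the CRLF file. Either outcome is handled here.
if git am "$work/patches"/*.patch; then
    plain_result=applied
    git reset -q --hard "$base"
else
    plain_result=failed
    git am --abort
fi
echo "plain git am: $plain_result"

# With --keep-cr the CRs survive and the patch can apply
# (expect a trailing-whitespace warning on stderr).
git am --keep-cr "$work/patches"/*.patch

notebook_tree=$(git -C "$work/notebook" rev-parse 'HEAD^{tree}')
desktop_tree=$(git -C "$work/desktop" rev-parse 'HEAD^{tree}')
echo "notebook tree: $notebook_tree"
echo "desktop tree:  $desktop_tree"
```

The tree IDs describe file content only, so they can match even when the commit IDs do not.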

But, (of course) the hash is different as described in nikai's answer... Thanks to nikai for pointing me to the hash stuff.

In my notebook-desktop scenario, I wanted to transfer some changes offline from the notebook to the desktop computer, but the repos should not diverge, nor should the same commit occur twice after I applied the patch on the desktop and then ran git pull desktop from the notebook.

To achieve this, I did the following:

  1. On the desktop, apply the patch as described above using git am --keep-cr ...
  2. On the notebook, do a git pull desktop, which leads to the situation that each commit introduced by the patch occurs twice (once for the original notebook commit, once for the patched and pulled-in desktop commit)
  3. Now (being on the master branch of the notebook), issuing a git rebase desktop/master leads to a No changes -- Patch already applied message and kicks out the original notebook commits, which are replaced by the desktop commits
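The three steps can be sketched end to end in throwaway repositories. Everything below is an assumption made for the demo: the temporary paths, the file name, the Demo identity, the fixed committer date (used only so the two copies of the commit get different IDs), and the explicit --no-rebase --no-edit flags, which newer Git versions need to merge divergent branches without prompting.

```shell
#!/bin/sh
# Demo only: paths, names and dates are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Notebook: base commit plus one change, both with CRLF line endings.
git init -q "$work/notebook"
cd "$work/notebook"
git symbolic-ref HEAD refs/heads/master   # make the branch name predictable
git config core.autocrlf false
printf 'one\r\ntwo\r\n' > file.txt
git add file.txt
git commit -q -m 'base'
printf 'one\r\nTWO\r\n' > file.txt
GIT_COMMITTER_DATE='1600000000 +0000' git commit -q -a -m 'change line two'
git format-patch -1 -o "$work/patches" >/dev/null

# Desktop: a clone rewound to the base commit.
git clone -q "$work/notebook" "$work/desktop"
cd "$work/desktop"
git config core.autocrlf false
git reset -q --hard HEAD~1

# Step 1: apply the patch on the desktop.
git am -q --keep-cr "$work/patches"/*.patch

# Step 2: pull the desktop branch into the notebook; the change now
# exists twice in history (original commit and applied copy).
cd "$work/notebook"
git remote add desktop "$work/desktop"
git pull -q --no-rebase --no-edit desktop master
echo "commits after pull:   $(git rev-list --count HEAD)"

# Step 3: rebase onto the desktop branch; the original notebook commit
# is recognised as already applied and dropped.
git rebase desktop/master
echo "commits after rebase: $(git rev-list --count HEAD)"
```

After the rebase, the notebook's master should point at the same commit as the desktop's master.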


Git 2.3.0 (February 2015) adds another option to git send-email: --transfer-encoding, which specifies the transfer encoding to use (7bit, 8bit, quoted-printable, or base64), instead of relying only on git am --keep-cr.

git send-email man page.
git am man page.

See commit 8d81408 by Paolo Bonzini (bonzini):

git-send-email: add --transfer-encoding option

The mailing-list thread details problems when applying patches with "git am" in a repository with CRLF line endings.
In the example in the thread, the repository originated from "git-svn" so it is not possible to use core.eol and friends on it.

Right now, the best option is to use "git am --keep-cr".
However, when a patch creates new files, the patch application process will reject the new file because it finds a "/dev/null\r" string instead of "/dev/null".

The problem is that SMTP transport is CRLF-unsafe.
Sending a patch by email is the same as passing it through "dos2unix | unix2dos".
The newly introduced CRLFs are normally transparent because git-am strips them. The keepcr=true setting preserves them, but it is mostly working by chance and it would be very problematic to have a "git am" workflow in a repository with mixed LF and CRLF line endings.
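To make the "dos2unix | unix2dos" comparison concrete, here is a small emulation with tr and sed. It is only an illustration added here, not part of the commit message, and it does not call Git at all; the file names are hypothetical.

```shell
#!/bin/sh
# Emulates what mail transport does to line endings; no Git involved.
set -e
cd "$(mktemp -d)"
cr=$(printf '\r')
count_cr() { tr -cd '\r' < "$1" | wc -c | tr -d ' '; }

# A file with mixed endings: one LF line and one CRLF line.
printf 'lf line\ncrlf line\r\n' > original.txt

# "dos2unix | unix2dos": every line leaves the mail system ending in CRLF.
tr -d '\r' < original.txt | sed "s/\$/$cr/" > on_the_wire.txt

# Default git am behaviour: the CR at the end of each line is stripped.
sed "s/$cr\$//" on_the_wire.txt > default.txt

# keepcr=true behaviour: every CR is preserved.
cp on_the_wire.txt keep_cr.txt

for f in original.txt on_the_wire.txt default.txt keep_cr.txt; do
    echo "$f: $(count_cr "$f") CR(s)"
done
```

Neither result reproduces the original mix of one LF line and one CRLF line, which is why a mixed-ending repository is a poor fit for this workflow.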

The MIME solution to this is the quoted-printable transfer encoding.
This is not something that we want to enable by default, since it makes received emails horrible to look at.
However, it is a very good match for projects that store CRLF line endings in the repository.

The only disadvantage of quoted-printable is that quoted-printable patches fail to apply if the maintainer uses "git am --keep-cr".
This is because the decoded patch will have two carriage returns at the end of the line.
Therefore, add support for base64 transfer encoding too, which makes received emails downright impossible to look at outside a MUA (Mail User Agent), but really just works.
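The double carriage return can be illustrated with a hand-rolled emulation. This is an added sketch, not Git's actual decoder: it assumes the only encoded sequence is =0D and uses sed to stand in for both the CR stripping and the quoted-printable decoding.

```shell
#!/bin/sh
# Hand-rolled emulation of one quoted-printable line; no Git involved.
set -e
cd "$(mktemp -d)"
cr=$(printf '\r')
count_cr() { tr -cd '\r' < "$1" | wc -c | tr -d ' '; }

# A repository line "foo" + CRLF: quoted-printable writes the CR as =0D,
# and the transport terminates the encoded line with its own CRLF.
printf 'foo=0D\r\n' > wire.txt

# Default: the transport CR is dropped first, then =0D is decoded.
sed "s/$cr\$//" wire.txt | sed "s/=0D/$cr/g" > default.txt

# --keep-cr: the transport CR stays, and decoding adds a second one.
sed "s/=0D/$cr/g" wire.txt > keep_cr.txt

echo "default:   $(count_cr default.txt) CR(s)"
echo "--keep-cr: $(count_cr keep_cr.txt) CR(s)"
```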

The patch covers all bases, including users that still live in the late 80s, by also providing a 7bit content transfer encoding that refuses to send emails with non-ASCII characters in them.
And finally, "8bit" will add a Content-Transfer-Encoding header but otherwise do nothing.

The doc for git send-email will now include:

--transfer-encoding=(7bit|8bit|quoted-printable|base64)

Specify the transfer encoding to be used to send the message over SMTP.
7bit will fail upon encountering a non-ASCII message.

Quoted-printable can be useful when the repository contains files that contain carriage returns, but makes the raw patch email file (as saved from a MUA) much harder to inspect manually.

Default is the value of the 'sendemail.transferEncoding' configuration value; if that is unspecified, git will use 8bit and not add a Content-Transfer-Encoding header.
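For a project that stores CRLF line endings, the setting could be made persistent per repository instead of being passed on every invocation. A minimal sketch in a scratch repository; the choice of base64 and the recipient address in the comment are placeholders:

```shell
#!/bin/sh
# Demo only: a scratch repository; pick the encoding that suits the project.
set -e
cd "$(mktemp -d)"
git init -q .

# Persist the transfer encoding for this repository.
git config sendemail.transferEncoding base64
encoding=$(git config --get sendemail.transferEncoding)
echo "sendemail.transferEncoding = $encoding"

# One-off equivalent on the command line (not run here, placeholder address):
#   git send-email --transfer-encoding=base64 --to=maintainer@example.com *.patch
```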


With Git 2.32 (Q2 2021), "git mailinfo" (and hence "git am") learned the "--quoted-cr" option to control how lines ending with CRLF wrapped in base64 or qp are handled.

See commit 59b519a, commit 133a4fd, commit f1aa299, commit 0b68956 (10 May 2021), and commit dd9323b, commit d582992 (06 May 2021) by Đoàn Trần Công Danh (sgn).
(Merged by Junio C Hamano -- gitster -- in commit 483932a, 16 May 2021)

mailinfo: warn if CRLF found in decoded base64/QP email

Signed-off-by: Đoàn Trần Công Danh

When SMTP servers receive 8-bit email messages, possibly with only LF as line ending, some of them decide to change said LF to CRLF.

Some mailing list software, when it receives 8-bit email messages, decides to encode those messages in base64 or quoted-printable.

If an email is transferred through such mail servers and then distributed by such mailing list software, the recipients will receive an email containing a patch mangled with CRLF encoded inside another encoding.

Thus, such CR (in CRLF) couldn't be dropped by "mailsplit".
Hence, the mailed patch couldn't be applied cleanly.
Such accidents have been observed in the wild.

Instead of silently rejecting those messages, let's give our users some warnings if such CR (as part of CRLF) is found.

The warning will be:

warning: quoted CRLF detected


There was a similar question here: apparently same commits give different sha1, why?

As a short recap, git cat-file commit <sha> should be able to narrow down whether tree, parents, emails, dates, names from author or committer differ, or if an extra '\n' was introduced in the commit messages.
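A small sketch of that comparison, using a scratch repository in which the same commit is recreated with a different committer date (the dates, file name and identity are made up):

```shell
#!/bin/sh
# Demo only: scratch repository, hypothetical identity and dates.
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo hello > f
git add f
GIT_AUTHOR_DATE='1600000000 +0000' GIT_COMMITTER_DATE='1600000000 +0000' \
    git commit -q -m demo
first=$(git rev-parse HEAD)

# Recreate the commit; only the committer date changes.
GIT_COMMITTER_DATE='1600000100 +0000' git commit -q --amend --no-edit
second=$(git rev-parse HEAD)

# Dump both commit objects and compare them field by field.
git cat-file commit "$first" > first.txt
git cat-file commit "$second" > second.txt
diff first.txt second.txt || true
```

Here the only differing line is the committer line, yet the two commit IDs differ.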


Short answer: git am --committer-date-is-author-date

Long story: I just discovered that I was in the same boat while trying to sneakernet commits between two repositories. After experimenting with git am's options, I noticed that the file IDs always matched, but that consecutive runs of git am were producing different commit IDs for the same options. It turns out that there are two timestamps -- the "author" timestamp you normally see, and a "committer" timestamp when the commit is created. The latter is set to the current time by default in git am. You need the --committer-date-is-author-date option to keep the dates in sync, which in turn will get your commit IDs in sync.
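A sketch of the effect in throwaway repositories follows. It assumes the committer identity is the same on both sides and that the original commit's committer date equals its author date (forced here with fixed, made-up dates); if either differs, the IDs will still not match.

```shell
#!/bin/sh
# Demo only: paths, identity and dates are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Source repository: the commit to transfer has equal author and committer dates.
git init -q "$work/notebook"
cd "$work/notebook"
echo one > file.txt
git add file.txt
git commit -q -m 'base'
echo two >> file.txt
GIT_AUTHOR_DATE='1600000000 +0000' GIT_COMMITTER_DATE='1600000000 +0000' \
    git commit -q -a -m 'add line two'
original=$(git rev-parse HEAD)
git format-patch -1 -o "$work/patches" >/dev/null

# Target repository: a clone rewound to the base commit.
git clone -q "$work/notebook" "$work/desktop"
cd "$work/desktop"
git reset -q --hard HEAD~1

# Plain git am stamps the commit with the current time as committer date.
git am -q "$work/patches"/*.patch
plain=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# With the option, the committer date is copied from the author date.
git am -q --committer-date-is-author-date "$work/patches"/*.patch
synced=$(git rev-parse HEAD)

echo "original: $original"
echo "plain:    $plain"
echo "synced:   $synced"
```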

Because of this, I would say that git bundle is much more reliable if that is an option in your environment.
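For comparison, a bundle carries the commit objects themselves, so nothing is re-created on the receiving side. A minimal sketch with hypothetical paths and a bundle file name chosen for the demo:

```shell
#!/bin/sh
# Demo only: paths, identity and the bundle file name are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Source repository with a base commit and one new commit.
git init -q "$work/notebook"
cd "$work/notebook"
git symbolic-ref HEAD refs/heads/master   # make the branch name predictable
echo one > file.txt
git add file.txt
git commit -q -m 'base'
echo two >> file.txt
git commit -q -a -m 'add line two'

# Target repository: a clone rewound to the base commit.
git clone -q "$work/notebook" "$work/desktop"
git -C "$work/desktop" reset -q --hard HEAD~1

# Pack only the new commit into a file that can be carried over offline.
git bundle create "$work/changes.bundle" HEAD~1..master

# On the target, the bundle file acts like a read-only remote.
cd "$work/desktop"
git bundle verify "$work/changes.bundle"
git pull -q --ff-only "$work/changes.bundle" master

echo "notebook: $(git -C "$work/notebook" rev-parse HEAD)"
echo "desktop:  $(git rev-parse HEAD)"
```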
