For a build script, I need to work with source packages of a certain version. To avoid bundling large source archives, the script just stores their SHA-1 checksums and downloads the archives automatically. This works very well for official releases such as
http://download.videolan.org/pub/videolan/libdca/0.0.5/libdca-0.0.5.tar.bz2
However, some packages don't provide an official release, so I download a well-tested version from the version control system. For instance, Gitweb provides a handy "snapshot" feature for downloading a tar.gz archive:
http://git.videolan.org/?p=libbluray.git;a=snapshot;h=cf9ee593f;sf=tgz
Unfortunately, this URL returns a slightly different file on each request. The tar archive inside is always byte-for-byte identical and is always gzip-compressed the same way, but the timestamp field near the beginning of the gzip header differs between downloads.
Those few bytes change the checksum on every download, so the script can no longer verify the integrity of the downloaded source archive.
How can I circumvent this issue?
Just zcat $archive | sha1sum
it if the tar stream is stable. Otherwise, you could check out the exact commit with git (perhaps as a shallow clone, --depth 1), or store pristine-tar deltas that let you rebuild a byte-identical archive.
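To make this concrete, here is a minimal sketch of the verification step, hashing the decompressed tar stream so the volatile gzip timestamp never enters the checksum. The function name and variables are illustrative, not from the original script:

```shell
#!/bin/sh
# verify_snapshot ARCHIVE EXPECTED_SHA1
# Hashes the *decompressed* tar stream, so differing gzip header
# timestamps across downloads do not affect the checksum.
# (Hypothetical helper; names are not from the asker's script.)
verify_snapshot() {
    archive=$1
    expected=$2
    actual=$(zcat "$archive" | sha1sum | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Hypothetical usage:
# verify_snapshot libbluray-cf9ee593f.tar.gz 0123456789abcdef... || exit 1
```

The stored checksum must then be computed the same way (zcat piped into sha1sum), not over the compressed file.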
The zcat solution is appropriate, but if for any reason you are worried that zcat wastes CPU, you can skip the fixed 10-byte header at the start of the gzip file, which contains the timestamp (see http://www.gzip.org/zlib/rfc-gzip.html#file-format), and hash the rest:
tail --bytes=+11 $archive | sha1sum
(The +11 means "start output at byte 11", i.e. skip exactly the 10 header bytes.)
Also tail --bytes=+11 $archive | openssl sha1
could be useful in an environment where you don't have sha1sum.
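The effect of the timestamp bytes is easy to demonstrate locally. This sketch (assuming GNU coreutils and gzip; the filenames are made up) compresses the same data twice with different mtimes: the full-file hashes differ, but the hashes of the header-stripped streams match:

```shell
#!/bin/sh
# Demonstration: only the MTIME field (bytes 4-7 of the 10-byte gzip
# header) differs between the two archives below.
printf 'same payload\n' > data
touch -d '2009-01-01 00:00' data
gzip -c data > a.gz          # header stores data's mtime
touch -d '2009-06-01 00:00' data
gzip -c data > b.gz          # same payload, different mtime

cmp -s a.gz b.gz || echo "full archives differ (timestamp bytes)"
[ "$(tail -c +11 a.gz | sha1sum)" = "$(tail -c +11 b.gz | sha1sum)" ] \
    && echo "header-stripped hashes match"
```

Note that this relies on the server always producing the gzip stream the same way apart from the timestamp, which is the behavior described in the question.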