Download and write .tar.gz files without corruption

Developer https://www.devze.com 2022-12-27 16:15 Source: Internet
How do you download files, specifically .zip and .tar.gz, with Ruby and write them to the disk?

This question was originally specific to a bug in MacRuby, but the answers are relevant to the above general question.

Using MacRuby, I've found that the downloaded file appears to be the same size as the reference file, but the archives refuse to extract. What I'm attempting now is at: https://gist.github.com/arbales/8203385

Thanks!


I've successfully downloaded and extracted GZip files with this code:

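require 'open-uri'
require 'zlib'

# Open the local file in binary mode ('wb') so the bytes are written unchanged.
File.open('tarball.tar', 'wb') do |local_file|
  # URI.open is the open-uri entry point on current Rubies; Kernel#open no longer accepts URLs on Ruby 3.0+.
  URI.open('http://github.com/jashkenas/coffee-script/tarball/master/tarball.tar.gz') do |remote_file|
    # Decompress the gzip stream and write the plain tar data.
    local_file.write(Zlib::GzipReader.new(remote_file).read)
  end
end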
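
The snippet above reads the whole decompressed archive into memory before writing it. For large tarballs, a chunked variant may be preferable. The following is only a sketch: gunzip_stream and its chunk_size parameter are hypothetical names, and gzip_io can be any readable IO, such as the object yielded by open-uri.

```ruby
require 'zlib'

# Hypothetical helper: decompress a gzip stream to dest_path in fixed-size chunks.
def gunzip_stream(gzip_io, dest_path, chunk_size = 16 * 1024)
  gz = Zlib::GzipReader.new(gzip_io)
  File.open(dest_path, 'wb') do |out|
    # GzipReader#read(length) returns nil once the stream is exhausted.
    while (chunk = gz.read(chunk_size))
      out.write(chunk)
    end
  end
ensure
  # finish releases the zlib state but leaves gzip_io open for the caller.
  gz&.finish
end
```

With open-uri, this could be called as gunzip_stream(remote_file, 'tarball.tar') inside the block that receives remote_file.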


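I'd recommend using open-uri in Ruby's stdlib.

require 'open-uri'

# out_file and url are placeholders for the local path and the remote address.
# 'wb' writes the bytes unchanged; URI.open fetches the remote file.
File.open(out_file, 'wb') do |out|
  out.write(URI.open(url).read)
end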

http://ruby-doc.org/stdlib/libdoc/open-uri/rdoc/classes/OpenURI/OpenRead.html#M000832

Make sure you look at the :progress_proc option to open as it looks like you want a progress hook.
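
As a rough illustration of that progress hook, open-uri accepts :content_length_proc and :progress_proc options. The helper names progress_options and download_with_progress below are hypothetical, and the output format is just one possible choice.

```ruby
require 'open-uri'

# Hypothetical helper: build open-uri options that report progress to `out`.
def progress_options(out = $stderr)
  total = nil
  {
    # Called once with the Content-Length value, or nil if the server sent none.
    content_length_proc: ->(length) { total = length },
    # Called repeatedly with the number of bytes received so far.
    progress_proc: lambda do |received|
      if total && total > 0
        out.print format("\r%d%% (%d of %d bytes)", received * 100 / total, received, total)
      else
        out.print "\r#{received} bytes"
      end
    end
  }
end

# Hypothetical helper: download url to out_file in binary mode with progress output.
def download_with_progress(url, out_file)
  File.open(out_file, 'wb') do |out|
    URI.open(url, progress_options) { |remote| IO.copy_stream(remote, out) }
  end
end
```

A call such as download_with_progress(url, out_file) would then print a running percentage while the file is fetched.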


The last time I got corrupted files with Ruby was when I forgot to call file.binmode right after File.open. It took me hours to find out what was wrong. Does that help with your issue?
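
A minimal sketch of that binmode point, assuming hypothetical helper names write_binary and read_binary: the file is switched to binary mode before anything is written, so no newline or encoding translation is applied to the archive bytes.

```ruby
# Hypothetical helper: write raw bytes to disk without any translation.
def write_binary(path, bytes)
  File.open(path, 'w') do |file|
    file.binmode        # same effect as opening the file with mode 'wb'
    file.write(bytes)
  end
end

# Hypothetical helper: read the file back as raw bytes for comparison.
def read_binary(path)
  File.binread(path)
end
```

Comparing read_binary(path) with the original bytes, or comparing file sizes, is a quick way to confirm that nothing was altered on the way to disk.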


When downloading a .tar.gz with open-uri via a simple open() call, I was also getting errors uncompressing the file on disk. I eventually noticed that the file size was much larger than expected.

Inspecting download.tar.gz on disk showed that it actually contained the uncompressed download.tar, which could be untarred directly. This seems to be due to an implicit Accept-Encoding: gzip header on the open() call. That makes sense for web content, but it is not what I wanted when retrieving a gzipped tarball. I was able to work around it by sending a blank Accept-Encoding header in the optional hash argument to the remote open():

require 'open-uri'

File.open('/local/path/to/download.tar.gz', 'wb') do |file|
  # Send a blank Accept-Encoding header so the gzip data is written as received
  file.write URI.open('https://example.com/remote.tar.gz', 'Accept-Encoding' => '').read
end
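
To tell whether a downloaded file is still gzip-compressed or was decompressed on the way, one option is to check its first two bytes. Gzip streams begin with the bytes 0x1F 0x8B. The constant GZIP_MAGIC and the method gzip_file? below are hypothetical names used only for this sketch.

```ruby
# The two-byte signature at the start of every gzip stream.
GZIP_MAGIC = "\x1F\x8B".b

# Hypothetical helper: true if the file at path starts with the gzip signature.
def gzip_file?(path)
  File.binread(path, 2) == GZIP_MAGIC
end
```

If gzip_file?('/local/path/to/download.tar.gz') returns false, the file on disk is probably a plain tar archive despite its name.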