
I need Multi-Part DOWNLOADS from Amazon S3 for huge files

开发者 https://www.devze.com 2023-02-06 10:13 Source: web

I know Amazon S3 added multi-part upload for huge files. That's great. What I also need is similar functionality on the client side, for customers who get part way through downloading a gigabyte-plus file and have errors.

I realize browsers have some level of retry and resume built in, but when you're talking about huge files I'd like to be able to pick up where they left off regardless of the type of error.

Any ideas?

Thanks, Brian


S3 supports the standard HTTP "Range" header if you want to build your own solution.

S3 Getting Objects
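As a minimal sketch of the build-your-own approach (in Python, with a hypothetical object URL), resuming comes down to checking how many bytes are already on disk and asking S3 for only the remainder via the Range header:

```python
import os

def resume_range_header(local_path):
    """Return a Range header requesting only the bytes not yet on disk.

    If nothing has been downloaded yet, return no Range header so the
    server sends the whole object.
    """
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    return {"Range": f"bytes={offset}-"} if offset else {}

# Hypothetical usage with the requests library (URL is a placeholder):
#   headers = resume_range_header("bigfile.bin")
#   r = requests.get("https://my-bucket.s3.amazonaws.com/bigfile.bin",
#                    headers=headers, stream=True)
#   with open("bigfile.bin", "ab") as f:  # append to the partial file
#       for chunk in r.iter_content(1 << 20):
#           f.write(chunk)
```

On retry after an error, the same call simply picks up from the new file size; S3 answers a ranged request with 206 Partial Content.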


I use aria2c. For private content, you can use "GetPreSignedUrlRequest" to generate temporary private URLs that you can pass to aria2c.


S3 has a feature called byte-range fetches. It's the download complement to multipart upload:

Using the Range HTTP header in a GET Object request, you can fetch a byte-range from an object, transferring only the specified portion. You can use concurrent connections to Amazon S3 to fetch different byte ranges from within the same object. This helps you achieve higher aggregate throughput versus a single whole-object request. Fetching smaller ranges of a large object also allows your application to improve retry times when requests are interrupted. For more information, see Getting Objects.

Typical sizes for byte-range requests are 8 MB or 16 MB. If objects are PUT using a multipart upload, it’s a good practice to GET them in the same part sizes (or at least aligned to part boundaries) for best performance. GET requests can directly address individual parts; for example, GET ?partNumber=N.

Source: https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html
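The quoted guidance can be sketched as a small helper (Python, hypothetical function name) that splits an object into aligned byte ranges, 16 MB parts here; each value goes into the Range header of its own concurrent GET:

```python
def byte_ranges(total_size, part_size=16 * 1024 * 1024):
    """Split an object of total_size bytes into HTTP Range header values.

    Each entry can be fetched on its own connection, e.g.
    GET /key with header {"Range": "bytes=0-16777215"}.
    """
    return [
        f"bytes={start}-{min(start + part_size, total_size) - 1}"
        for start in range(0, total_size, part_size)
    ]

# A 40 MB object split into 16 MB parts yields three ranges,
# and a failed part can be retried alone without refetching the rest.
print(byte_ranges(40 * 1024 * 1024))
```

If the object was uploaded with multipart, using the same part size keeps these ranges aligned to part boundaries, as the whitepaper recommends.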


Just updating for the current situation: S3 natively supports multipart GET as well as PUT. https://youtu.be/uXHw0Xae2ww?t=1459


NOTE: For Ruby users only

Try the aws-sdk gem for Ruby and download with:

object = Aws::S3::Object.new(...)  # Aws:: (v2 SDK namespace), not AWS:: (v1)
object.download_file('path/to/file.rb')

Because it downloads large files with multipart by default.

Files larger than 5 MB are downloaded using the multipart method.

http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#download_file-instance_method

