A colleague of mine thinks that HDFS has no maximum file size, i.e., that by partitioning a file into 128 MB or 256 MB chunks, a file of any size can be stored (obviously the HDFS disks have a finite capacity and that will limit it, but is that the only limit?). I can't find anything saying that there is a limit, so is she correct?
thanks, jim
Well, there is obviously a practical limit. But physically, HDFS block IDs are Java longs, so their maximum is 2^63; with a 64 MB block size, the theoretical maximum file size works out to 512 yottabytes.
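For concreteness, here is that arithmetic as a small Java sketch (the 2^63 block-ID range and the 64 MB block size come from the answer above; the class name is just illustrative):

```java
import java.math.BigInteger;

public class HdfsTheoreticalMax {
    public static void main(String[] args) {
        BigInteger two = BigInteger.valueOf(2);
        // Block IDs are Java longs, so there are at most 2^63 non-negative IDs.
        BigInteger maxBlocks = two.pow(63);
        // 64 MB block size = 2^26 bytes (the default mentioned above).
        BigInteger blockSizeBytes = two.pow(26);
        BigInteger maxBytes = maxBlocks.multiply(blockSizeBytes); // 2^89 bytes
        // A binary yottabyte is 2^80 bytes, so this is 2^9 = 512 of them.
        System.out.println(maxBytes.divide(two.pow(80)) + " YB"); // prints: 512 YB
    }
}
```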
I think she's right in saying there's no maximum file size in HDFS. The only thing you can really set is the chunk (block) size, which is 64 MB by default. Files of any length can be stored; the only constraint is that the bigger the file, the more hardware you need to accommodate it.
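If you do want a different chunk (block) size for a particular file, you can request one when the file is created through the Hadoop FileSystem API. A minimal sketch, assuming a standard Hadoop client on the classpath; the path, replication factor, and 256 MB value are made up for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CustomBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/tmp/big-file.dat");   // hypothetical path
        long blockSize = 256L * 1024 * 1024;         // ask for 256 MB instead of the default
        short replication = 3;                       // typical default replication factor
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);

        // create(path, overwrite, bufferSize, replication, blockSize)
        try (FSDataOutputStream out =
                 fs.create(file, true, bufferSize, replication, blockSize)) {
            out.writeUTF("written with a 256 MB block size");
        }
    }
}
```

The per-file block size only affects how that file is split into blocks; it does not change the cluster-wide default (dfs.blocksize, or dfs.block.size on older releases).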
I am not an expert in Hadoop, but AFAIK there is no explicit limit on the size of a single file, though there are implicit factors such as overall storage capacity and maximum namespace size. Also, there may be administrative quotas on the number of entities and on directory sizes (see the sketch below). The HDFS capacity topic is very well described in this document. Quotas are described here and discussed here.
I'd recommend paying some extra attention to Michael G. Noll's blog, referenced by the last link; it covers many Hadoop-specific topics.
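To make the quota point above concrete, here is a hedged sketch of capping a directory's name count and space usage via DistributedFileSystem.setQuota; the directory path and limits are hypothetical, the call requires superuser privileges, and in practice most admins set quotas with hdfs dfsadmin -setQuota / -setSpaceQuota from the shell instead:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class DirectoryQuotaSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        if (fs instanceof DistributedFileSystem) {
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            Path dir = new Path("/user/jim/data");   // hypothetical directory

            // Allow at most 1,000,000 names (files + directories) under the
            // directory, and at most 10 TB of raw space (replication included).
            long namespaceQuota = 1_000_000L;
            long spaceQuota = 10L * 1024 * 1024 * 1024 * 1024;
            dfs.setQuota(dir, namespaceQuota, spaceQuota);
        }
    }
}
```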