I have an apparently "simple" problem but I can't find the solution for some reason...
I have n million files of different sizes and I want to find the average file size. To simplify it, I grouped them in multiples of 16 KB:

< 16 KB = 18689546 files
< 32 KB = 1365713 files
< 48 KB = 1168186 files
...

Of course, the simple (total_size / number of files) does not work. It gives an average of 291 KB...
What would be the algorithm to calculate the real average...?

Thx, JD
You might be running into an overflow when summing the file sizes (the total size probably doesn't fit into a 32-bit value), in which case total_size itself is wrong and so is the average. The easiest fix might be to use a 64-bit integer for the variable that holds the sum.
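To illustrate the effect, here is a minimal sketch. The file sizes are made up (three files of 2 GiB each, not taken from the question), and the 32-bit wraparound is simulated with a bit mask, since Python integers do not overflow on their own. The function names are hypothetical.

```python
# Hypothetical data: three files of 2 GiB each, sizes in bytes.
sizes = [2 * 1024**3] * 3


def average_wrapped_32(sizes):
    """Average computed with a sum that wraps like an unsigned 32-bit counter."""
    total = 0
    for s in sizes:
        # Keep only the low 32 bits, as a 32-bit accumulator would.
        total = (total + s) & 0xFFFFFFFF
    return total / len(sizes)


def average_exact(sizes):
    """Average computed with a sum wide enough to hold the full total."""
    return sum(sizes) / len(sizes)


if __name__ == "__main__":
    print("32-bit sum:", average_wrapped_32(sizes))
    print("wide sum:  ", average_exact(sizes))
```

With a wrapped sum the average comes out far too small, while the wide sum gives the expected 2 GiB. If the numbers in the question show a similar mismatch, the accumulator width is a likely culprit.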