In Ruby, what's the most efficient method for reading giant text files? On the order of 10^7 lines at ~89 bytes/line. Is one method significantly better than another?
I did some benchmarks a while back to see what would be a good way to load a text file. The fastest was to read it in blocks of text, then iterate over the lines using String#lines.
Reading a text file that is 188,593,869 bytes as a baseline:
IO.foreach(ARGV.shift) do |li|
  print li
end
time ruby test.rb root.mbox > /dev/null
#
# real 0m3.949s
# user 0m3.709s
# sys 0m0.182s
I dump it to /dev/null to remove screen I/O from the timing.
Instead of reading exclusively line-by-line, load it in a big chunk then iterate over the lines:
File.read(ARGV.shift).lines.each do |l|
  print l
end
time ruby test.rb root.mbox > /dev/null
#
# real 0m3.492s
# user 0m3.281s
# sys  0m0.209s
That's a 0.5 second savings. It also sucked in 188 MB of data in one shot, which hardly scales well if you have bigger files. The nice thing is that read() lets you load the entire file, which is what I did, or pass a length to limit the read size.
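To combine the block-reading speedup with bounded memory use, you can read fixed-size chunks and carry any partial trailing line into the next chunk. This is a minimal sketch of that idea, not code from the benchmark above; the method name and the 1 MB chunk size are illustrative choices.

```ruby
CHUNK_SIZE = 1_048_576 # 1 MB per read; tune to taste

# Yields each line of the file without ever holding more than
# one chunk (plus a partial line) in memory.
def each_line_chunked(path, chunk_size = CHUNK_SIZE)
  leftover = ""
  File.open(path, "r") do |f|
    while (chunk = f.read(chunk_size))
      chunk = leftover + chunk
      lines = chunk.lines
      # The last element may be a partial line cut off mid-chunk;
      # hold it back and prepend it to the next chunk.
      leftover = (lines.last && !lines.last.end_with?("\n")) ? lines.pop : ""
      lines.each { |line| yield line }
    end
  end
  yield leftover unless leftover.empty?
end
```

Usage mirrors IO.foreach: `each_line_chunked("root.mbox") { |l| print l }`. The carry-over logic matters because a chunk boundary will usually fall in the middle of a line.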
Here's a cleaned up output from wc
for the text file for your reference:
lines: 2,465,369
words: 26,466,463
bytes: 188,593,869