I'm using the Linux split command to split huge XML files into node-sized ones. The problem is that I now have a directory with hundreds of thousands of files.
I want a way to get a file from the directory (to pass to another process for import into our database) without needing to list everything in it. Is this how Dir.foreach already works? Any other ideas?
You can use Dir.glob to find the files you need. More details here, but basically, you pass it a pattern like Dir.glob 'dir/*.rb' and get back the filenames matching that pattern. I assume the matching is done reasonably efficiently, but that will depend on your platform and Ruby implementation. Keep in mind that it returns all matches as one array, so a broad pattern still lists the whole directory.
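To make the glob approach concrete, here is a minimal sketch. The helper name matching_files and the '*.xml' pattern in the usage note are my own assumptions for illustration, not something from the question.

```ruby
# Hypothetical helper: return the paths in `dir` whose names match `pattern`.
# Dir.glob builds the complete array of matches before returning, so this
# narrows the result but does not avoid scanning the directory.
def matching_files(dir, pattern)
  # Sort explicitly so the order does not depend on the Ruby version.
  Dir.glob(File.join(dir, pattern)).sort
end
```

Something like matching_files('chunks', '*.xml').first would then give you one file to hand to the importer, at the cost of building the full list of matches first.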
As for Dir.foreach, it should be efficient too. The concern would be if it had to process the entire directory on every pass through the loop, but that would be an awful implementation, and it is not the case.
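If all you need is one file at a time, a rough sketch along these lines stops as soon as it finds something. The helper name first_file is hypothetical, and I'm assuming the standard Ruby interpreter, where as far as I know Dir.foreach hands entries to the block as it reads them rather than building the whole list first.

```ruby
# Hypothetical helper: return the path of the first regular file in `dir`,
# or nil if the directory contains no regular files.
def first_file(dir)
  Dir.foreach(dir) do |name|
    # Dir.foreach also yields the "." and ".." entries; skip them.
    next if name == "." || name == ".."

    path = File.join(dir, name)
    # Returning from inside the block ends the iteration early.
    return path if File.file?(path)
  end
  nil
end
```

You could call first_file in a loop, moving or deleting each file once it has been imported, so each call only reads as far as the next remaining file. Which file comes back first depends on the filesystem, so don't rely on any particular order.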