I am interfacing Node.js with a library that provides iterator-style access to data:
next = log.get_next()
I effectively want to write the following:
while (next = log.get_next()) {
console.log(next);
}
and redirect stdout to a file (e.g. node log.js > log.txt). This works well for small logs, but for large logs the output file is empty and my memory usage goes through the roof.
It appears I don't fully understand I/O in node, as a simple infinite loop that writes a string to the console also exhibits the same behavior.
Some advice on how to accomplish this task would be great. Thanks.
The WriteStream class buffers I/O, and if you never yield the thread, the queued writes never get serviced. The best approach is to write a reasonable chunk of data, then wait for the buffer to clear before writing again. The WriteStream class emits a 'drain' event that tells you when the buffer has been fully flushed. Here's an example:
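var os = require('os');

// Write one batch, then report free memory on stderr so it stays out of the redirected file.
function dump() {
  for (var i = 0; i < 10000; i++) {
    console.log('xxxx');
  }
  console.error(os.freemem());
}

// Write the next batch only after the previous one has been flushed.
process.stdout.on('drain', dump);

dump();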
If you run like:
node testbuffer > output
you'll see that the file grows periodically and the memory reaches a steady state.
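Applied to the iterator-style loop from the question, the same idea might look like the sketch below. It is only an illustration: makeLog is a hypothetical stand-in for the real library, pump is a made-up helper name, and it assumes get_next() returns null when the log is exhausted. It relies on write() returning false once the stream's internal buffer has reached its high-water mark, and resumes on 'drain'.

```javascript
const { Writable } = require('stream');

// Hypothetical stand-in for the real library: yields 'entry 0', 'entry 1', ...
// and then null once `count` entries have been returned.
function makeLog(count) {
  let i = 0;
  return {
    get_next() {
      return i < count ? 'entry ' + i++ : null;
    }
  };
}

// Write entries until the stream reports that its buffer is full,
// then wait for 'drain' before pulling more entries from the log.
function pump(log, out, done) {
  let next;
  while ((next = log.get_next()) !== null) {
    if (!out.write(next + '\n')) {
      out.once('drain', function () {
        pump(log, out, done);
      });
      return;
    }
  }
  if (done) done(); // every entry has been handed to the stream
}
```

Calling pump(makeLog(1000000), process.stdout) would then replace the while loop. Because entries are only pulled from the log when the stream can accept them, memory use should stay roughly flat instead of growing with the size of the log.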
The library you're interfacing with ought to accept a callback. Node.js is designed to be non-blocking. I think that perhaps console.log keeps returning control to the loop (and log.get_next()) before it sends the output.
If the module was rewritten to make get_next support a callback, improved code might be like this:
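// Assumes a rewritten get_next(callback) that passes the next entry to the callback.
var log_next = function(next) {
  if (!next) return; // no more entries
  console.log(next);
  log.get_next(log_next);
};
log.get_next(log_next);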
(There are libraries and patterns that could make this code prettier.)
If the code is only synchronous and has to stay as it is, calling setTimeout with 0 or another small number could keep it from blocking the entire process.
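var log_next = function() {
  var next = log.get_next();
  if (!next) return; // stop once the log is exhausted
  console.log(next);
  setTimeout(log_next, 0); // yield to the event loop before the next entry
};
log_next();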
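One entry per timer tick can be slow for a large log, since Node.js treats a setTimeout delay below 1 as 1 ms. A possible variation, sketched here under assumptions, handles a batch of entries per turn and yields with setImmediate instead of setTimeout. makeLog is again a hypothetical stand-in for the real library, drainInBatches is a made-up helper name, and get_next() is assumed to return null at the end of the log.

```javascript
// Hypothetical stand-in for the real library: yields 'entry 0', 'entry 1', ...
// and then null once `count` entries have been returned.
function makeLog(count) {
  let i = 0;
  return {
    get_next() {
      return i < count ? 'entry ' + i++ : null;
    }
  };
}

// Handle up to `batchSize` entries per turn of the event loop, then yield
// with setImmediate so that pending I/O gets a chance to run.
function drainInBatches(log, batchSize, onEntry, done) {
  function step() {
    for (let n = 0; n < batchSize; n++) {
      const next = log.get_next();
      if (next === null) {
        if (done) done(); // log exhausted
        return;
      }
      onEntry(next);
    }
    setImmediate(step); // give control back before the next batch
  }
  step();
}
```

For the question's case this could be called as drainInBatches(log, 1000, function (entry) { console.log(entry); }). The batch size of 1000 is an arbitrary choice; smaller batches yield more often at the cost of throughput.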