
How to write (large) files with Ruby Eventmachine

Developer https://www.devze.com 2023-02-03 15:25 Source: web

I've spent several days looking for non-echo-server examples for EventMachine, but there just don't seem to be any. Let's say I want to write a server that accepts a file and writes it to a Tempfile:

require 'rubygems'
require 'tempfile'
require 'eventmachine'

module ExampleServer

  def receive_data(data)
    f = Tempfile.new('random')
    f.write(data)
  ensure
    f.close
  end

end

EventMachine::run {
  EventMachine::start_server "127.0.0.1", 8081, ExampleServer
  puts 'running example server on 8081'
}

Writing to the file would block the reactor, but I don't get how to do it 'EventMachine style'. Would I have to read the data in chunks and write each chunk to disk within an EM.next_tick block?

Thanks for any help, Andreas


Two answers:

Lazy answer: just use a blocking write. EM is already handing you discrete chunks of data, not one gigantic string. So your example implementation may be a bit off. Are you sure you want to make a new tempfile for every single chunk that EM hands you? However, I'll continue on the assumption that your sample code is working as intended.

Admittedly, the lazy approach depends on the device you're writing to, but trying to write several large streams to disk at the same time is going to be a major bottleneck, and you'll lose the advantages of having an event-based server anyway. You'll just end up juggling disk seeks all over the place, IO performance will plummet, and so will your server's performance. Handling many things at once is okay with RAM, but once you start dealing with block devices and IO scheduling, you're going to run into performance bottlenecks no matter what you're doing.
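As a rough sketch of the lazy approach, assuming the goal is one file per connection rather than one per chunk: open the Tempfile when the connection is set up, append each chunk with a plain blocking write, and close it when the peer disconnects. The module name UploadServer and the byte counter are made up for illustration; post_init, receive_data and unbind are the usual EventMachine connection callbacks.

```ruby
require 'tempfile'

# Hypothetical handler (name and counter are illustrative, not from the question):
# one Tempfile per connection, written with ordinary blocking writes.
module UploadServer
  # Called by EventMachine once the connection has been set up.
  def post_init
    @file = Tempfile.new('upload')
    @file.binmode
    @bytes_written = 0
  end

  # Called once per received chunk; the chunk is appended to the same file.
  def receive_data(data)
    @bytes_written += @file.write(data)
  end

  # Called when the connection is closed; the upload is complete at this point.
  def unbind
    @file.close
    puts "wrote #{@bytes_written} bytes to #{@file.path}"
  end
end
```

To try it, pass UploadServer to EventMachine::start_server in place of ExampleServer. Each write still blocks the reactor briefly, which is exactly the trade-off described above.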

However, I guess you might want to do some long writes to disk at the same time that you want low latency responses to other, non-IO heavy requests. So, perhaps the good answer:

Use defer.

require 'rubygems'
require 'tempfile'
require 'eventmachine'

module ExampleServer

  def receive_data(data)
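    # Runs on a thread from EM's pool, so the blocking write stays off the reactor.
    operation = proc do
      begin
        f = Tempfile.new('random')
        f.write(data)
      ensure
        # f is nil if Tempfile.new itself raised
        f.close if f
      end
    end

    # Runs back on the reactor thread once the operation has finished.
    callback = proc do
      puts "I wrote a file!"
    end

    EM.defer(operation, callback)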
  end

end

EventMachine::run {
  EventMachine::start_server "127.0.0.1", 8081, ExampleServer
  puts 'running example server on 8081'
}

Yes, this does use threading. It's really not that bad in this case: you don't have to worry about synchronization between threads, because EM is nice enough to handle this for you. If you need a response, use the callback, which will be executed in the main reactor thread when the worker thread completes. Also, the GIL is something of a non-issue for this case, since you're dealing with IO blocking here, and not trying to achieve CPU concurrency.

But if you did intend to write everything to the same file, you'll have to be careful with defer, since the synchronization issue will arise as your threads will likely attempt to write to the same file at the same time.
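One way to sidestep that synchronization problem, sketched here under the assumption that all chunks should land in a single file in arrival order: instead of handing each chunk to EM.defer's thread pool, push chunks onto a Queue and let one dedicated thread do all the writing. This is a swapped-in technique using only the Ruby standard library, not something EM provides; the class name SerialFileWriter is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: serializes all writes to one file through a single thread.
class SerialFileWriter
  def initialize(file)
    @file = file
    @queue = Queue.new
    # Only this thread ever touches @file, so chunks cannot interleave
    # and they are written in the order they were enqueued.
    @worker = Thread.new do
      while (chunk = @queue.pop)
        @file.write(chunk)
      end
      @file.flush
    end
  end

  # Cheap for the caller (e.g. receive_data running in the reactor thread):
  # it only enqueues the chunk and returns.
  def write(chunk)
    @queue << chunk
  end

  # Signals the end of input with nil and waits for pending chunks to be written.
  def close
    @queue << nil
    @worker.join
    @file.close
  end
end
```

In a handler, receive_data would call write(data). Note that close waits for the queue to drain, so calling it from unbind would block the reactor for that long; wrapping that one call in EM.defer is an option.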


From the docs, it seems you just need to attach the file and use send_data (although, as you point out, that might not be valid; the alternative seems to be File.write, i.e. blocking IO).

Although I thought you couldn't mix blocking and non-blocking IO with EM :(

Given that the source data is a socket, I guess that part will be handled by EventMachine.

Perhaps a question for the google group...

~chris


This is very similar to "What is the best way to read files in an EventMachine-based app?" (though there I wanted to know how to read files efficiently). There doesn't seem to be any non-blocking file API, so the best you can do is to write in short bursts with next_tick, or defer the writing (with defer) so that it runs in a separate thread (though I don't know how well that solution performs).
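The "short bursts" idea could look roughly like the following sketch. It buffers incoming data and writes at most burst_size bytes per tick, rescheduling itself until the buffer is empty. The class name BurstWriter, the burst_size default and the injected schedule callable are all assumptions made for illustration; inside a running reactor, EM.method(:next_tick) would be the natural thing to pass as schedule.

```ruby
require 'stringio'

# Hypothetical helper: writes a pending buffer in bounded slices, one slice per tick.
class BurstWriter
  # io:         any object responding to #write
  # schedule:   callable that runs the given block later,
  #             e.g. EM.method(:next_tick) inside a running reactor
  # burst_size: maximum number of bytes written per tick (assumed default)
  def initialize(io, schedule, burst_size = 16 * 1024)
    @io = io
    @schedule = schedule
    @burst_size = burst_size
    @pending = String.new # empty binary buffer
    @scheduled = false
  end

  # Buffers the data and makes sure a burst is scheduled.
  def <<(data)
    @pending << data.b
    schedule_burst
    self
  end

  private

  # Schedules one bounded write; the burst reschedules itself while data remains.
  def schedule_burst
    return if @scheduled || @pending.empty?
    @scheduled = true
    @schedule.call do
      @scheduled = false
      @io.write(@pending.slice!(0, @burst_size))
      schedule_burst
    end
  end
end
```

Each individual write is still a blocking call; the bursts only bound how long the reactor is held per tick, and the buffer grows in memory if data arrives faster than it is written.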


Unfortunately files don't respond well to select interfaces. If you need something more efficient than IO#write (which is unlikely), then you could use EIO.

EIO will really only lightly unblock the reactor, and provide you with a teeny bit of buffering. If specific latencies are a problem, or you have really slow disks, that might be helpful. In most other cases, it's probably just a bunch of effort for little advantage.
