I need to create a specialized HTTP server. For this I plan to use the epoll syscall, but I want to utilize multiple processors/cores and I can't come up with an architecture. At the moment my idea is the following: create multiple threads, each with its own epoll descriptor; the main thread accepts connections and distributes them among the threads' epoll instances. But are there any better solutions? Which books/articles/guides can I read on high-load architectures? I've only seen the C10K article, but most of its links to examples are dead :( and there are still no in-depth books on this subject :(.
Thank you for your answers.
UPD: Please be more specific, I need materials and examples (nginx is not an example because it's too complex and has multiple abstraction layers to support multiple systems).
Check the libevent and libev sources. They're highly readable, and already a good infrastructure to use.
Also, libev's documentation has plenty of examples of several tried and true strategies. Even if you prefer to write directly to epoll(), the examples can lead to several insights.
..my idea is the following: create multiple threads, each with its own epoll descriptor; the main thread accepts connections and distributes them among the threads' epoll instances.
Yes, that's currently the best way to do this, and it's close to how Nginx does it (Nginx runs worker processes rather than threads, each with its own event loop). The number of threads can be increased or decreased depending on load and/or the number of physical cores on the machine.
The trade-off between extra threads (more than the number of physical cores) and events is one of latency versus throughput. Threads improve latency because they can be pre-empted, but at the expense of throughput, due to the overhead of context switching and thread creation/deletion. Events improve throughput but have the disadvantage that long-running code stalls the entire thread.
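For what it's worth, here is a minimal sketch of the "one epoll instance per thread, one acceptor" layout discussed above. It is only an illustration under some assumptions: the names (struct worker, worker_start, worker_assign, accept_loop) are made up for this example, the workers simply echo data back instead of speaking HTTP, sockets are left in blocking mode with level-triggered events, and error handling is kept to the bare minimum.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

/* Hypothetical per-thread state: each worker owns a private epoll instance. */
struct worker {
    int epfd;
    pthread_t tid;
};

/* Worker thread: wait on its own epoll set and echo back whatever arrives.
   A real server would parse the request and build a response here. */
static void *worker_loop(void *arg)
{
    struct worker *w = arg;
    struct epoll_event events[MAX_EVENTS];
    char buf[4096];

    for (;;) {
        int n = epoll_wait(w->epfd, events, MAX_EVENTS, -1);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            ssize_t len = recv(fd, buf, sizeof buf, 0);
            /* EOF, error, or failed send: closing the descriptor also removes
               it from the epoll set (assuming it was never duplicated). */
            if (len <= 0 || send(fd, buf, (size_t)len, MSG_NOSIGNAL) < 0)
                close(fd);
        }
    }
    return NULL;
}

/* Create the worker's epoll instance and start its thread. */
int worker_start(struct worker *w)
{
    assert(w != NULL);
    w->epfd = epoll_create1(0);
    if (w->epfd < 0)
        return -1;
    if (pthread_create(&w->tid, NULL, worker_loop, w) != 0) {
        close(w->epfd);
        return -1;
    }
    return 0;
}

/* Called from the acceptor thread: register a connected socket with one
   worker. Adding to an epoll set while another thread is blocked in
   epoll_wait() on it is allowed. */
int worker_assign(struct worker *w, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(w->epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Acceptor loop for the main thread: hand out connections round-robin. */
void accept_loop(int listen_fd, struct worker *workers, int nworkers)
{
    assert(nworkers > 0);
    int next = 0;
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (worker_assign(&workers[next], fd) < 0)
            close(fd);
        next = (next + 1) % nworkers;
    }
}
```

In a real program, main() would create and listen() on a socket, call worker_start() once per core, and then run accept_loop(). Round-robin is just the simplest distribution policy; picking the least-loaded worker is another option.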
The second best is how Apache2 does it, using a pool of blocking threads. There is no event processing here, so the implementation is simpler, and the pool means threads are not created and destroyed unnecessarily. But it can't really compete with a well-implemented thread/asynchronous hybrid like Nginx or what you're trying to implement.
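To make the thread-pool idea concrete, here is a rough sketch of a fixed pool of blocking threads fed from a bounded queue of accepted sockets. This is not Apache's actual code; the names (struct conn_queue, queue_push, pool_start, and so on) are invented for the example, and each thread just echoes data until the peer closes.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define QUEUE_CAP 128

/* Hypothetical bounded queue of accepted sockets, shared by all pool threads. */
struct conn_queue {
    int fds[QUEUE_CAP];
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

void queue_init(struct conn_queue *q)
{
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Acceptor side: blocks while the queue is full. */
void queue_push(struct conn_queue *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->fds[(q->head + q->count) % QUEUE_CAP] = fd;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Pool side: blocks while the queue is empty. */
int queue_pop(struct conn_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return fd;
}

/* Pool thread: take one connection and serve it with plain blocking I/O
   until the peer closes. Here "serving" is just an echo. */
static void *pool_thread(void *arg)
{
    struct conn_queue *q = arg;
    char buf[4096];

    for (;;) {
        int fd = queue_pop(q);
        ssize_t len;
        while ((len = recv(fd, buf, sizeof buf, 0)) > 0)
            if (send(fd, buf, (size_t)len, MSG_NOSIGNAL) < 0)
                break;
        close(fd);
    }
    return NULL;
}

/* Start n pool threads that all consume from the same queue. */
int pool_start(struct conn_queue *q, pthread_t *threads, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        if (pthread_create(&threads[i], NULL, pool_thread, q) != 0)
            return -1;
    return 0;
}
```

The acceptor would call queue_push() for every accept()ed socket. Note that each thread is tied up for the whole lifetime of its connection, which is exactly why this design needs many more threads than cores under load.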
The third best is asynchronous event processing alone, like Lighttpd or Node.js. Well, it's the second best if you're not doing heavy processing in the server. But as mentioned earlier, a single long-running while loop blocks the entire server.
Unless you have a terabit uplink and plan to service 10000 simultaneous connections off a single server, forget about epoll. It's just gratuitous non-portability; poll or even select will do just as well. Keep in mind that by the time terabit uplinks and such are standard, your server will also be sufficiently faster that you still won't need epoll.
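As an illustration of the portable route, here is a small sketch of one iteration of a poll()-based event loop. The struct and function names (poll_set, poll_set_add, poll_set_run_once) are hypothetical, the fixed MAX_FDS limit is an arbitrary assumption, and the "handler" only echoes data back.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 1024

/* Hypothetical container for the descriptors watched by one poll() loop. */
struct poll_set {
    struct pollfd fds[MAX_FDS];
    nfds_t count;
};

/* Register a descriptor for read readiness; returns -1 if the set is full. */
int poll_set_add(struct poll_set *ps, int fd)
{
    assert(ps != NULL);
    if (ps->count == MAX_FDS)
        return -1;
    ps->fds[ps->count].fd = fd;
    ps->fds[ps->count].events = POLLIN;
    ps->fds[ps->count].revents = 0;
    ps->count++;
    return 0;
}

/* One loop iteration: wait up to timeout_ms, echo data on every ready
   descriptor, and drop descriptors whose peer closed or errored.
   Returns the number of descriptors echoed to, 0 on timeout, -1 on error. */
int poll_set_run_once(struct poll_set *ps, int timeout_ms)
{
    int ready = poll(ps->fds, ps->count, timeout_ms);
    if (ready <= 0)
        return ready;

    int handled = 0;
    nfds_t i = 0;
    while (i < ps->count) {
        if (ps->fds[i].revents == 0) {
            i++;
            continue;
        }
        char buf[4096];
        int fd = ps->fds[i].fd;
        ssize_t len = recv(fd, buf, sizeof buf, 0);
        if (len <= 0 || send(fd, buf, (size_t)len, MSG_NOSIGNAL) < 0) {
            close(fd);
            /* Swap-remove: move the last entry into slot i and re-examine it. */
            ps->fds[i] = ps->fds[--ps->count];
            continue;
        }
        handled++;
        i++;
    }
    return handled;
}
```

A server would add its listening socket and every accepted connection to the set and call poll_set_run_once() in a loop. The cost of poll() grows with the number of watched descriptors, which is the case epoll was designed for, so whether that matters depends on how many connections you really expect.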
If you're just serving static content, forget about threads too and use the Linux sendfile syscall. This too is nonstandard, but at least it offers huge real-world performance benefits.
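A minimal sketch of what using sendfile for static content might look like. The helper name send_whole_file is made up for this example; it assumes out_fd is a blocking descriptor (normally the client socket) and leaves out HTTP headers entirely.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the whole file at `path` to out_fd inside the kernel, without
   bouncing the data through a user-space buffer.
   Returns the number of bytes sent, or -1 on error. */
ssize_t send_whole_file(int out_fd, const char *path)
{
    assert(path != NULL);

    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    off_t offset = 0;
    while (offset < st.st_size) {
        /* sendfile() advances `offset` by the number of bytes it wrote. */
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(in_fd);
            return -1;
        }
        if (n == 0)
            break; /* file shrank while we were sending it */
    }

    close(in_fd);
    return (ssize_t)offset;
}
```

With a non-blocking socket the loop would also have to handle EAGAIN and resume from the saved offset once the socket becomes writable again.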
Also note that other design decisions (especially excess complexity) will be much more of a factor in how much load your server can handle. For an example, just look at how the modest single-threaded, single-process thttpd blows away Apache and friends in performance on static content -- and in my experience, even on traditional CGI dynamic content!