What is best, a single-threaded or a multi-threaded server?

I have to create a simple client<->server communication to transfer files using the C language (Linux).

The server accepts connections on port 10000. I don't know whether it is better to create a new thread for each request or to create a fixed number of threads and use asynchronous techniques.

CASE A:

client --> server --> (new thread) --> process the request

CASE B:

server --> create thread 1 - thread 2 - thread 3

then

client1 --> server --> thread 1
client2 --> server --> thread 2
client3 --> server --> thread 3
client4 --> server --> thread 1
client5 --> server --> thread 2
client6 --> server --> thread 3

In this case, thread 1 could process many clients' requests.

My considerations:

CASE A: faster, but wastes a lot of memory

CASE B: slower, but uses less memory

Am I wrong?
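
For reference, both cases start from the same listening socket on port 10000; here is a minimal sketch of that shared part (make_listener() is just an illustrative name, error handling trimmed):

    /* Create the listening socket both cases start from. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>

    int make_listener(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(10000);       /* port from the question */

        bind(fd, (struct sockaddr *)&addr, sizeof(addr));
        listen(fd, SOMAXCONN);
        return fd;   /* CASE A and CASE B differ only in who calls accept() next */
    }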


If you look at the architecture of widely used HTTP servers (nginx, lighttpd, Apache), you'll notice that the ones using a fixed thread count (so-called "worker threads", whose number should depend on the processor count of the server) are a lot faster than the ones using a large thread pool. However, there are some very important points:

  1. "Worker thread" implementation should not be as straightforward as it is tempting, or there will be no real performance gain. Each thread should implement each one pseudo concurrency using state machine, processing multiple requests at time. No blocking operations can be allowed here - for example, the time in thread that is wated to wait for I/O from hard drive to get file contents can be used to parse request for next client. That is pretty difficult code to write, though.

  2. A thread-based solution (with a re-usable thread pool, since thread creation IS a heavyweight operation) is optimal when weighing performance against coding time and code maintainability. If your server is not supposed to handle thousands of requests per second, you get to code in a fairly natural blocking style without risking a complete performance failure.

  3. As you may notice, the "worker thread" solutions themselves serve only static data; they proxy dynamic script execution to other programs. As far as I know (I may be wrong), that is due to the complexity of non-blocking processing of requests when some unknown dynamic code is executed in their context. That should not be an issue in your case anyway, since you are talking about simple file transfer.
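
To illustrate point 1, here is a minimal sketch of the per-connection state machine idea; the struct layout, the state names and conn_step() are assumptions for illustration, not code from any of the servers mentioned:

    /* Hypothetical per-connection state machine for a non-blocking worker.
     * Each readiness event advances one connection a little and returns,
     * so a single thread interleaves many requests. */
    #include <sys/sendfile.h>   /* sendfile() (Linux-specific) */
    #include <sys/types.h>
    #include <unistd.h>

    enum conn_state { READING_REQUEST, SENDING_FILE, DONE };

    struct conn {
        int             fd;        /* non-blocking client socket          */
        enum conn_state state;     /* where this request currently is     */
        char            req[512];  /* partially received request          */
        size_t          req_len;
        int             file_fd;   /* file being sent back                */
        off_t           file_off;  /* how much of it has been sent so far */
    };

    /* Called by the worker's event loop whenever c->fd is ready. */
    void conn_step(struct conn *c)
    {
        switch (c->state) {
        case READING_REQUEST: {
            ssize_t n = read(c->fd, c->req + c->req_len,
                             sizeof(c->req) - c->req_len);
            if (n > 0)
                c->req_len += (size_t)n;
            /* once the request is complete: open the file and switch to
             * SENDING_FILE; on EAGAIN simply return and wait for the next
             * readiness event instead of blocking                         */
            break;
        }
        case SENDING_FILE: {
            ssize_t sent = sendfile(c->fd, c->file_fd, &c->file_off, 64 * 1024);
            if (sent == 0)          /* whole file pushed out               */
                c->state = DONE;
            /* on EAGAIN just return; the next writable event resumes here */
            break;
        }
        case DONE:
            break;
        }
    }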

The reason why the limited-thread solution is faster on heavily loaded systems is that a thread context switch (http://en.wikipedia.org/wiki/Context_switch) is a pretty costly operation: it requires saving register contents and loading new ones, along with other thread-local data. If you have far too many threads compared to the processor count (say 1000x more), a lot of your application's time will be wasted simply switching between threads.

So, the short answer to your question is: "No, it has little to do with memory usage; the choice is all about the type of data served, the planned requests per second, and how much time you can spend on coding."


There's no right answer here. It depends on a lot of things, and you need to choose for yourself.

"CASE 1: Is faster but waste a lot of memory
"CASE 2: Is slower but use a low memory"

Wrong. It depends on a lot of things. Creating threads is not that expensive (it is, but not that much), but if there are too many threads, you'll have a problem.

This depends very much on the load. What is the expected load? If it is, let's say, about 1000 requests per second, then creating 1000 new threads every second... will be a disaster :D

Also, create only as many threads as the CPU can handle without (much) switching between them. There's a big chance (depending on your program, of course) that a single-core CPU will work much, much slower with 10 threads than with 1 (or 2). It really depends on what these threads do, too.
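
On Linux, one common way to pick that number is simply to ask how many processors are online; a minimal sketch:

    #include <unistd.h>

    /* Pick a worker count from the number of online processors. */
    static long worker_count(void)
    {
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        return n > 0 ? n : 1;   /* fall back to one worker if the query fails */
    }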

I'd choose to create a thread pool and reuse the threads.


My first choice would be to do it single-threaded using select(2). If that weren't good enough performance-wise, I'd go with a thread-pool solution, which will scale better.
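
A minimal sketch of what that select(2) loop looks like; handle_client() is just an assumed placeholder for the actual request/file handling:

    /* Single-threaded multiplexing with select(2): one loop watches the
     * listening socket and every connected client. */
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void serve(int listen_fd)
    {
        fd_set all, ready;
        int maxfd = listen_fd;

        FD_ZERO(&all);
        FD_SET(listen_fd, &all);

        for (;;) {
            ready = all;
            if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0)
                continue;                       /* interrupted; retry */

            if (FD_ISSET(listen_fd, &ready)) {  /* new client */
                int c = accept(listen_fd, NULL, NULL);
                if (c >= 0) {                   /* real code would also check FD_SETSIZE */
                    FD_SET(c, &all);
                    if (c > maxfd)
                        maxfd = c;
                }
            }

            for (int fd = 0; fd <= maxfd; fd++) {
                if (fd == listen_fd || !FD_ISSET(fd, &ready))
                    continue;
                /* handle_client(fd) would read the request and send the
                 * file here; on EOF/error, close(fd) and FD_CLR(fd, &all) */
            }
        }
    }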

There are times when creating one thread per client is perfectly OK. I've done that and it worked well for that application, with usually around 100 clients up to a maximum of 1000 clients. That was 15 years ago; today the same application can probably handle 10,000 clients thanks to better hardware. Just be aware that one thread per client doesn't scale very well.


I know it has been quite a while since you asked this, but here's my take on your question from the perspective of someone who has already written a handful of servers in C.

If the server is entirely your own code and does not depend on anyone else's, I would highly recommend that you do it single-threaded with non-blocking sockets, using epoll (Linux), kqueue (BSD) or WSAEventSelect (Windows).

This may require that you break code that would otherwise have been "simple" into much smaller chunks, but if scalability is what you're after, this will beat any thread-based or select-based server.
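
A rough sketch of the epoll-based loop on Linux; the accept and per-connection steps are only indicated by comments, and event_loop() is an assumed name, not code from the articles below:

    /* Skeleton of an epoll-driven loop with non-blocking sockets (Linux). */
    #include <sys/epoll.h>

    void event_loop(int listen_fd)
    {
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(ep, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {
                    /* accept4(..., SOCK_NONBLOCK) the client and register
                     * it with epoll_ctl(ep, EPOLL_CTL_ADD, ...)           */
                } else {
                    /* advance this connection's state machine; never block */
                }
            }
        }
    }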

There was a great article called "The C10K Problem" that focuses entirely on how to handle 10,000 concurrent connections. I actually learned a lot from it myself. Here's the link: http://www.kegel.com/c10k.html

There is also another great article focused on scalability, called "Scalable Networking", which you can find here: http://bulk.fefe.de/scalable-networking.pdf

Those two are great reads, hope that helps.


This is entirely up to you. There is no right or wrong answer. You've already identified the pros and cons of both, and you're right about them: A is faster but more resource-intensive, B is slower because clients may have to wait.


I would go with a pool of pre-created threads and re-use them when they are done with the request they're currently handling. Creating threads can be expensive, as it mostly involves calls into the kernel.

There is "threadpool" type project here using pthreads. Perhaps you can get some ideas from there on how to implement.


It really depends on what your server is doing.

I would recommend that you do the simplest thing possible. That is probably a single-process model which multiplexes all available connections using select, poll, libevent or similar.

That is, if you're using TCP.

If you use UDP, it's even easier: the application can do everything with one socket, so it can (possibly) use a blocking socket.
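
A minimal sketch of that single-socket UDP loop (port 10000 assumed from the question; the request format and reply logic are up to you):

    /* One blocking UDP socket handles every client in a simple loop. */
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    void serve_udp(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(10000);
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        char buf[1500];
        struct sockaddr_in peer;
        socklen_t peerlen;
        for (;;) {
            peerlen = sizeof(peer);
            ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&peer, &peerlen);
            if (n <= 0)
                continue;
            /* parse the request in buf and sendto() the reply / file chunks
             * back to &peer; no per-client socket or thread is needed       */
        }
    }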
