Partial Write for sockets in LINUX

We have server-client communication in our application, and sockets are used for it. We are using AF_INET sockets with SOCK_STREAM (TCP/IP). These sockets are in non-blocking mode (O_NONBLOCK). The application is written in C++ on UNIX.

In our system the server writes to a socket and the client reads from it. We have written code to handle partial writes: if a partial write happens, we try up to 30 more times to write the entire data.

Our server tries to write 2464 bytes to the socket. In some cases it cannot write the entire data, so it retries up to 30 more times. Most of the time the entire data is written within 30 tries. But sometimes, even after 30 retries, the server is not able to write the entire data, and it fails with an EAGAIN error. The problem happens on the client side when it tries to read this partially written data.

Suppose the server tried to write 2464 bytes, but after the 30 repeated attempts it could write only 1080 bytes. The server raises EAGAIN at this point. The client tries to read 2464 bytes. The read call returns 2464, so the read itself appears OK, but the data we receive is corrupted (it contains only partially written data), so the client crashes.

Can anyone please advise on the following:

1) Is it possible for the server itself to remove only the partially written data, so that the client does not receive corrupted, incomplete data? (We cannot use read() on the server to remove it. Suppose the server has successfully written n messages to the socket, and the client is busy and has not been able to read them. Then the server tries to write the (n+1)th message and a partial write occurs. If we use read() on the server, all n successful messages also get removed. We need to remove only the partially written (n+1)th message.)

2) Is there any way to identify on the client side that we have read a partially written message?

Please note that we are facing the partial write issue on Linux (Red Hat 5.4) only. The system works fine on Solaris (on Solaris, either the entire data is written or no data is written within the 30 tries of write).

Thanks in advance.

There's something terribly wrong in your code.

You should call write() as many times as necessary to transfer all the data you want. I see no reason to stop after 30 times.
If you're using non-blocking sockets, you should probably use select() (or poll(), or anything similar) to get notified when you can write more data.
There's something wrong with the receiving end: if you've sent less than 2464 bytes, you shouldn't be able to read that amount from the client socket. Do you check the value returned from read() (i.e. the number of bytes read)? Again, on the client side you should use select() etc. and call read() as many times as needed to receive the full message.

What you're seeing is normal behavior for a non-blocking socket. When the buffers (both local and remote) become full, you get partial writes.

You should not give up after 30 tries that yield EAGAIN/EWOULDBLOCK, but keep trying. You should use select()/poll() or similar to get notified when you can resume writing, or you should just use blocking calls. That you see different results on Solaris vs RHEL is just (un)luck.

1) No, you'd have to close the connection in such a case, and have the client deal with the partial data.
2) No, not unless you close the TCP connection. (*)

If you always send messages of 2464 bytes, you're probably OK, but keep in mind that TCP is a stream; it does not deal with "messages".

(*) Technically there are many ways, but they require substantial effort. For example, you could implement HDLC-like frames on top of TCP yourself, where messages are delimited by a special bit pattern. The user data would have to be escaped (bit-stuffing) so that it does not contain that special bit pattern.
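To give a rough idea of what such framing involves, here is a small sketch. Note that it uses byte stuffing (as in PPP's HDLC-like framing, with flag 0x7E, escape 0x7D and XOR 0x20) rather than true bit stuffing, because byte stuffing is much simpler on top of a byte stream. The names `encode_frame` and `FrameDecoder` are hypothetical, and the sketch assumes empty payloads never need to be sent.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

constexpr std::uint8_t kFlag = 0x7E;    // frame delimiter
constexpr std::uint8_t kEscape = 0x7D;  // escape marker
constexpr std::uint8_t kXor = 0x20;     // applied to an escaped byte

// Wrap a payload in flags, escaping any flag/escape bytes inside it.
Bytes encode_frame(const Bytes& payload) {
    Bytes out;
    out.push_back(kFlag);
    for (std::uint8_t b : payload) {
        if (b == kFlag || b == kEscape) {
            out.push_back(kEscape);
            out.push_back(static_cast<std::uint8_t>(b ^ kXor));
        } else {
            out.push_back(b);
        }
    }
    out.push_back(kFlag);
    return out;
}

// Incremental decoder: feed it received bytes one at a time.
class FrameDecoder {
public:
    // Returns true and fills `frame` when a flag closes a non-empty frame.
    bool feed(std::uint8_t b, Bytes& frame) {
        if (b == kFlag) {
            escaped_ = false;
            if (current_.empty()) return false;  // opening flag, nothing collected yet
            frame = std::move(current_);
            current_.clear();
            return true;
        }
        if (b == kEscape) {
            escaped_ = true;                      // next byte is stuffed
            return false;
        }
        current_.push_back(escaped_ ? static_cast<std::uint8_t>(b ^ kXor) : b);
        escaped_ = false;
        return false;
    }

private:
    Bytes current_;
    bool escaped_ = false;
};
```

Delimiters alone only tell the receiver where a frame ends. A receiver that expects 2464-byte messages could discard any decoded frame of a different length, and a checksum inside the frame would be a more robust way to spot a truncated one.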