
TCP/IP accept() not returning, but the client's connect does

Developers (开发者) https://www.devze.com 2023-01-02 15:10 Source: web

Server:

VxWorks 6.3

It calls the usual socket, bind, listen, then:

for (;;)
{
  client = accept(sfd,NULL,NULL);
  // pass client to worker thread
}

Client:

.NET 2.0

It connects with the TcpClient constructor that takes a string hostname and an int port, like:

TcpClient client = new TcpClient(server_ip, port);

This works fine when the server is compiled and run on Windows (native C++).

Intermittently, the TcpClient constructor will return the instance without throwing any exception, but the accept call in VxWorks does not return with the client fd. tcpstatShow indicates no accept occurred.

What could possibly make the TcpClient constructor (which calls 'Connect') return the instance, while the accept call on the server does not return? It seems to be related to what the system is doing in the background: the symptom is more likely to occur when the server is busy persisting data to flash or an NFS share at the moment the client attempts to connect, but it can also happen when the server is idle.

I've tried adjusting the priority of the thread running accept.

I've looked at the size of the queue in 'listen'. There's enough.

The total number of file descriptors available should be enough (I haven't validated this yet, though; that's the first thing I'll do in the morning).
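
For context on the symptom above: on Linux and BSD-derived stacks the kernel completes the TCP handshake itself and parks the connection in the listen queue, so a client's connect can succeed before the server ever calls accept. The sketch below demonstrates that on loopback with plain POSIX sockets. It assumes VxWorks 6.3 behaves similarly, which is not confirmed here, and the helper names (listen_on_loopback, connect_to_loopback, connection_pending) are made up for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1 with an OS-chosen port.
 * Returns the listening fd (or -1) and stores the port in *port. */
int listen_on_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int sfd;

    assert(port != NULL);
    sfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sfd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the OS pick a free port */

    if (bind(sfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(sfd, 5) < 0 ||
        getsockname(sfd, (struct sockaddr *)&addr, &len) < 0) {
        close(sfd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return sfd;
}

/* Hypothetical helper: blocking connect to 127.0.0.1:port.
 * Returns the client fd, or -1 on failure. */
int connect_to_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    if (cfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(cfd);
        return -1;
    }
    return cfd;
}

/* Returns 1 if a completed connection is waiting to be accepted on sfd,
 * 0 if none showed up within timeout_ms, -1 on poll error. */
int connection_pending(int sfd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = sfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Calling listen_on_loopback, then connect_to_loopback, then connection_pending (without any accept in between) shows the client connected while the connection is still only queued on the server side. If the accepting thread is starved or blocked, the client would see exactly this: a successful connect and no accept.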


Would it be possible for you to post a Wireshark/Netmon capture of what is happening on the wire?


There could be many reasons, but we won't know unless we get more information from both the server and the client side. Does it report any errors? A list of TCP/IP errors can be found here: Windows Socket Error. On the server side, are you catching any exceptions? Maybe you can try closing the connection (with a linger of 1 second) after it hits an error?
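
As a rough illustration of the "linger of 1 second" suggestion, here is a minimal sketch using the standard SO_LINGER socket option on a POSIX-style sockets API. The function name set_linger is made up for this example, and header names or exact behavior on VxWorks 6.3 may differ.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable SO_LINGER so that close() on a blocking
 * socket waits up to `seconds` for unsent data to be delivered.
 * Returns 0 on success, -1 with errno set on failure. */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = 1;        /* turn lingering on */
    lg.l_linger = seconds; /* linger timeout in seconds */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

It would be called as set_linger(client, 1) right before close(client) on the error path.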


Is it possible to bind the server to another port and see if it accepts there? If the client returns, it sounds like it's getting an accept from something on your server. I do not know about VxWorks, but in Windows you should always try not to bind to anything under 1000.


Your server's accept() call looks wrong. The POSIX accept() call that I know has:

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen); 

where *addr is a required pointer that gets written to if the call works—indeed, one of the failure states for the call is:

[EFAULT]    The address parameter is not in a writable part of the user address space.

I haven't done Windows socket programming, but I understand it's POSIX-compliant, and Beej's guide doesn't mention any exceptions for Windows for accept(), so this should still apply. Somewhat relevant, the Python accept() call also 'returns' the address field (I say somewhat since Python did its best to emulate the C networking API as it made sense.)

I would suggest checking errno and using perror after the accept call in the server, to see if [EFAULT] is set. It will also tell you if you ran out of descriptors, as errno gets set to [EMFILE] or [ENFILE] in that case.
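
A minimal sketch of that check, assuming a POSIX-style sockets API: it passes a real address buffer to accept, retries on EINTR, and reports any other failure through perror. The name accept_logged is made up for this example. Note that POSIX documents a null address argument as allowed, so the main value here is the errno report and the logged peer address rather than the non-NULL arguments themselves.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper around accept(): returns the client fd, or -1
 * with errno preserved after the failure has been printed via perror. */
int accept_logged(int sfd)
{
    struct sockaddr_storage peer;
    socklen_t peer_len;
    int client;
    int saved;

    for (;;) {
        peer_len = sizeof peer;
        client = accept(sfd, (struct sockaddr *)&peer, &peer_len);
        if (client >= 0)
            break;
        if (errno == EINTR)
            continue;      /* interrupted by a signal: just retry */
        saved = errno;
        perror("accept");  /* e.g. EMFILE/ENFILE when out of descriptors */
        errno = saved;     /* perror may clobber errno; restore it */
        return -1;
    }

    /* sockaddr_storage is large enough for any address family. */
    assert(peer_len <= sizeof peer);

    if (peer.ss_family == AF_INET) {
        char ip[INET_ADDRSTRLEN];
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)&peer;

        if (inet_ntop(AF_INET, &in4->sin_addr, ip, sizeof ip) != NULL)
            fprintf(stderr, "accepted %s:%u\n", ip,
                    (unsigned)ntohs(in4->sin_port));
    }
    return client;
}
```

In the server loop from the question, client = accept_logged(sfd); would replace the bare accept call, with a check for -1 before handing the descriptor to the worker thread.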

If that doesn't prove to be the issue, use ncat, as either server or client, to investigate further. I'd run it with -vv since you want to know exactly when connections are made, what's sent, and so on.
