I am desperate to get some help on this one.
We are running into problems when a high volume of messages is sent over a TCP socket connection to a different server for HSM card validation.
For volume testing, we processed 3 blocks of 30,000 transactions each at 15 transactions/sec, and during the third block transactions started to be rejected.
The situation was:
- We processed blocks 1 and 2, with 30,000 transactions each, successfully.
- In the third block the system processed only 8,000 transactions successfully, and after that the connections to the HSM were blocked.
- We saw that all HSM sockets were in use, so transactions were rejected.
We think some of the sockets are not being closed, or are timing out, because of the volume of messages.
Below is the gist of the code.
; Open the device
OPEN SOCK:(CONNECT=HOST_":"_PORT_":TCP":DELIMITER=$C(13,10,58,27,95):ATTACH="HSMCLIENT"):TIMOUT:"SOCKET"
ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
; Use the socket
USE SOCK
; Write the request. The request message is packed and the byte stream is written
WRITE HREQ,#
; Read the first two bytes from the socket to identify the length of the response
READ BRESP#2:TIMOUT ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
; Calculate the length of the incoming data
SET RESPLEN=$A(BRESP,1)_$A(BRESP,2)
; Now read the data of the calculated length
READ BRESP#RESPLEN:TIMOUT ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
; Cleanup
CLOSE SOCK
#ENDBYPASS
If you can provide any suggestions or recommendations, they would be really appreciated.
Thanks
You're most likely leaving the sockets in TIME_WAIT on the HSM as you're initiating the active close.
Due to the rate of connection establishment, the duration of the TIME_WAIT period, and the finite number of ephemeral ports, you eventually run out of available ports and can't accept any more connections.
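As a rough illustration of that reasoning, the number of sockets sitting in TIME_WAIT at steady state is approximately the connection rate multiplied by the TIME_WAIT duration. The sketch below is in Python purely for the arithmetic. The rate of 15 connections/sec comes from the question (one connection per transaction); the TIME_WAIT durations and the limit of 1,000 sockets are assumptions for illustration, so check the real values for your OS and HSM.

```python
def steady_state_time_wait(rate_per_sec: float, time_wait_secs: float) -> int:
    """Approximate number of sockets in TIME_WAIT once the backlog stabilises."""
    return int(rate_per_sec * time_wait_secs)


def exhausts(limit: int, rate_per_sec: float, time_wait_secs: float) -> bool:
    """True if the TIME_WAIT backlog alone would reach the given socket/port limit."""
    return steady_state_time_wait(rate_per_sec, time_wait_secs) >= limit


RATE = 15  # connections/sec, from the question (one connection per transaction)

# Assumed TIME_WAIT durations in seconds; the real value is OS-dependent.
for time_wait in (60, 120, 240):
    backlog = steady_state_time_wait(RATE, time_wait)
    # 1000 is a hypothetical limit, not a documented HSM figure.
    print(time_wait, backlog, exhausts(1000, RATE, time_wait))
```

With these assumed numbers the backlog is in the hundreds to low thousands, so it would only cause rejections if the limit actually in play (for example, a cap on concurrent connections) is of that order or smaller.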
You may be able to avoid the HSM's socket going into TIME_WAIT by aborting your connection on completion of your transaction (sending an RST rather than a FIN). You do this in code by enabling the linger option with a timeout of zero before closing your connection.

Alternatively, your HSM may have a command that you can send that means "thanks for that, I'm done, please close the connection", which would allow it to initiate the active close and then move the TIME_WAIT to the client machines. This may not help if you have a single client machine, as you'll just switch the problem from the HSM not being able to accept more connections to you not being able to initiate them.
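To make the abortive close concrete, here is a minimal sketch in Python rather than M, since the source does not say whether the M runtime in the question exposes an equivalent option. It assumes a Unix-like system where the linger structure is two C ints; `abortive_close` and `peer_sees_reset` are hypothetical helper names, and the loopback listener only stands in for the HSM.

```python
import socket
import struct


def abortive_close(sock: socket.socket) -> None:
    """Close sock so that the peer receives an RST instead of a FIN."""
    # l_onoff=1 enables linger, l_linger=0 sets a zero timeout: on close the
    # stack discards unsent data and resets the connection.
    # The "ii" layout (two C ints) is an assumption that holds on Linux.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def peer_sees_reset() -> bool:
    """Loopback demo: True if the accepting side observes a connection reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)
    try:
        abortive_close(client)
        try:
            server_side.recv(1)  # a normal close would return b"" here
        except ConnectionResetError:
            return True
        return False
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print("peer saw reset:", peer_sees_reset())
```

An abortive close discards any data that has not yet been sent, so it is only appropriate once the full response has been read.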