We use IBM WebSphere MQ for SWIFT messages. When a SWIFT message is received, it is processed and put onto local queues as processing goes on. The flow looks like this:
Outside World > Q1 > App > Q2 > App > Q3 > App
The queues are local queues. However, there is sometimes a considerable delay, on the order of days, before a message reaches the application from Q1/Q2/Q3. This happens arbitrarily, and we have no clue why. Most of the messages get through pretty quickly, but there are a couple in 3-4 days that arrive late.
All this happens in a transaction and we use Atomikos as our Transaction Manager.
Has anybody faced a similar issue before? Any help is appreciated.
Thanks, Midhun.
There are a number of ways in which WebSphere MQ messages can be delayed and diagnosing may take a little detective work. Here are a few of the more common causes:
- Message stuck under syncpoint. Although it would be unusual for a message to sit under syncpoint for days, I've seen it happen. The issue is that some applications are designed to batch up several messages in a single transaction. When messages do not arrive in a multiple of the batch size, the remaining messages sit and wait for more messages to arrive and close the batch.
- Message stuck under syncpoint. In another case, the program logic does not commit a syncpoint until the next message is read. When several threads are processing messages, distribution of load is not necessarily uniform across all threads and one can be starved for messages if the load is light.
- Message orphaned by browse. In this scenario, a message arrives at a higher priority than the current browse cursor. If the rescan interval is set extremely high and traffic volume is also high, it may take a while before the browse cursor is reset to the top of the queue.
- Program error. You didn't mention which version of WMQ client and server you are using (hopefully both at 7.0!) but occasionally there are issues that cause threads to hang. These can tie up a message under syncpoint. It is always a good idea to go to the latest FixPack for your version and check the link called "Problems fixed in..." to see if an APAR addresses your specific issue. If so, apply the latest Fix Pack.
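To make the first cause concrete, here is a minimal simulation of a consumer that commits only when a batch fills up, next to a variant that also commits after an idle period. It is a sketch only: it makes no WebSphere MQ or Atomikos calls, and the class name, method names and the idle-timeout parameter are all hypothetical. The `onIdle` call stands in for a get-with-wait that timed out without returning a message.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulation only: models batch commits in memory, with no real MQ or transaction manager calls. */
final class BatchCommitSketch {
    private final int batchSize;
    private final long maxIdleMillis; // 0 disables the idle flush (the problematic design)
    private final List<String> uncommitted = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();
    private long lastMessageMillis;

    BatchCommitSketch(int batchSize, long maxIdleMillis) {
        this.batchSize = batchSize;
        this.maxIdleMillis = maxIdleMillis;
    }

    /** A message was read under syncpoint; commit only when the batch is full. */
    void onMessage(String msg, long nowMillis) {
        uncommitted.add(msg);
        lastMessageMillis = nowMillis;
        if (uncommitted.size() >= batchSize) {
            commit();
        }
    }

    /** The read timed out with no message; flush a partial batch if it has waited long enough. */
    void onIdle(long nowMillis) {
        boolean idleFlushEnabled = maxIdleMillis > 0;
        boolean waitedLongEnough = nowMillis - lastMessageMillis >= maxIdleMillis;
        if (idleFlushEnabled && !uncommitted.isEmpty() && waitedLongEnough) {
            commit();
        }
    }

    private void commit() {
        committed.addAll(uncommitted);
        uncommitted.clear();
    }

    int pendingCount() { return uncommitted.size(); }
    int committedCount() { return committed.size(); }

    public static void main(String[] args) {
        BatchCommitSketch naive = new BatchCommitSketch(3, 0);
        BatchCommitSketch withFlush = new BatchCommitSketch(3, 500);
        for (int i = 0; i < 5; i++) {          // 5 messages, batch size 3
            naive.onMessage("m" + i, i);
            withFlush.onMessage("m" + i, i);
        }
        naive.onIdle(10_000);
        withFlush.onIdle(10_000);
        System.out.println("naive pending: " + naive.pendingCount());
        System.out.println("with idle flush pending: " + withFlush.pendingCount());
    }
}
```

In the naive variant the remainder of the last batch stays uncommitted until enough new messages arrive, which on a quiet queue can be a long time. Committing on an empty read is one common way to bound that wait.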
To start diagnosing this, use the DIS QSTATUS command to display the number of input and output handles on the queue, the message age and any outstanding units of work. You can also use the exit in SupportPac MA0W to get a human-readable trace of all API calls on a given queue. This can be an extremely valuable diagnostic tool because you can tell exactly how long a message sat under syncpoint, whether it is being continuously backed out and re-read, what options are used for the API call, etc. You can even limit the trace to specific queues or specific threads, which is helpful if you need to let it run for a while.
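As a starting point for the queue status check, the small helper below generates the MQSC text for each queue in the flow. It is only a sketch: the class and method names are made up for illustration, and the queue names Q1 to Q3 are taken from the question rather than from a real configuration. The `TYPE(QUEUE)` form reports queue-level status such as depth, open handle counts and uncommitted work, while `TYPE(HANDLE)` lists the individual handles that have the queue open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Builds MQSC commands as plain text; it does not connect to a queue manager. */
final class QStatusScript {

    /** Returns two DISPLAY QSTATUS commands per queue, one line each. */
    static String forQueues(List<String> queueNames) {
        return queueNames.stream()
                .map(q -> "DISPLAY QSTATUS(" + q + ") TYPE(QUEUE) ALL\n"
                        + "DISPLAY QSTATUS(" + q + ") TYPE(HANDLE) ALL\n")
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // Queue names assumed from the question's Q1 > Q2 > Q3 flow.
        System.out.print(forQueues(Arrays.asList("Q1", "Q2", "Q3")));
    }
}
```

The generated text can be fed to `runmqsc` for the queue manager in question. Note that the message age value is only populated when queue monitoring (the `MONQ` attribute) is switched on for the queue or queue manager.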