I've been running Service Broker in my development environment for a few months now and have had perfectly adequate performance, close to 1000 messages per second (plenty for my needs).
I've also been running on a cut-down replica of my real production environment, which involves a forwarding instance, and today pushed some load through it for the first time, with terrible results! I'm trying to understand what I've been seeing but am struggling a bit, so I thought I'd put it out here to see if anyone can help.
Firstly, messages are being delivered from start to end through the forwarder. However, when I pushed a few thousand messages, I saw batches of between 20 and 100 being sent, followed by delays of a minute or two. The messages are ultimately processed successfully.
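To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. It only uses the figures quoted above (batches of 20 to 100 messages, delays of one to two minutes, roughly 1000 messages per second in development). The function name `effective_rate` is made up for illustration, and the calculation assumes each batch is sent instantly and then followed by one full delay, which is a simplification.

```python
# Rough estimate only: assumes a batch goes out instantly and is then
# followed by one full delay. Real behaviour will be messier than this.

def effective_rate(batch_size: int, delay_seconds: float) -> float:
    """Messages per second when one batch is sent per delay period."""
    return batch_size / delay_seconds

# Figures taken from the description above.
worst = effective_rate(20, 120)   # smallest batch, longest delay
best = effective_rate(100, 60)    # largest batch, shortest delay
dev_rate = 1000                   # approximate rate reported in development

print(f"worst case: {worst:.2f} msg/s")
print(f"best case:  {best:.2f} msg/s")
print(f"slowdown vs development: {dev_rate / best:.0f}x to {dev_rate / worst:.0f}x")
```

Under those assumptions the forwarded path is moving somewhere around 0.17 to 1.67 messages per second, which is several hundred times slower than the development figure.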
Looking at the queue on the Store (the initial sender), there are thousands of messages sitting waiting to be forwarded, which are trickling out.
The security setup goes like:
Store database -> Certificate -> Forwarding instance -> Windows Security -> Central database
When I switch on the profilers I am seeing lots of errors:
Some examples on the forwarding instance:
7 - Send IO Error (10054(failed to retrieve text for this error. Reason: 15105))
Forwarded Message Dropped (The forwarded message has been dropped because a transport send error occurred when sending the message. Check previous events for the error.)
And on my 'central' target instance:
A corrupted message has been received. The binary message preamble is malformed.
Broker message undeliverable This message was dropped because it could not be dispatched on time. State: 2
Can anybody help by pointing me towards some checks I could make, or maybe something obvious that I've missed? I know I've got something wrong but just can't see what.
Edit - 14/1/2011 - more information: we took our message forwarding instance out of the equation and saw massive improvements immediately - 2000 messages were delivered in seconds.
The architecture uses transport security, so we're currently trying to switch over to dialog security, as we've read that combining transport security with forwarding can harm performance. We're hoping dialog security will somehow reduce what needs to be decrypted by the forwarding instance, thereby improving performance.
First thing Monday I want to switch off encryption on the transport layer (between initiator and forwarder) to see if that is where our bottleneck is occurring. Is it possible that this could cause a big overhead in our communications or should one forwarding instance not produce such a big bottleneck?
What SQL Server version?
Several forwarding performance issues have been fixed. I recommend you upgrade to the latest SQL Server 2008 R2 and deploy the latest cumulative updates. If upgrading is problematic in your environment, you can upgrade only the forwarder instance.
This might be a stupid suggestion, but have you changed the network topology lately? Maybe swapped out a network cable or overheated a switch? If this is occurring suddenly, it sounds more like a physical change than a logical one. I'd check the Windows event log on both machines.
Yes, dialog security is the best approach in conjunction with forwarders. Otherwise the overhead will be enormous.