DCOM: How to close connection in server on client crash?

I have a rather old project: a DCOM client and server, both in C++/ATL, Windows only. Everything works fine: local and remote clients connect to the server and work simultaneously without any problems.

But when a remote client crashes, or is killed via Task Manager, the "taskkill" command or a power-off, I have a problem. My server does not know about the client crash and tries to send new events to all clients, including the crashed ones. As a result the server pauses (it cannot send data to a crashed client), and the length of the pause is proportional to the number of crashed remote clients. After 5 crashed clients the pauses are so long that the server has effectively stopped.

I know about the DCOM "ping" mechanism (DCOM should disconnect clients that do not respond to the "every 2 minutes" ping after 6 minutes of silence). And indeed, after 6 minutes of hanging I get a short period of normal work, but then the server goes back to the "paused" state.

What can I do about this? How can I make the DCOM "ping" work properly? If I implement my own "ping" code, is it possible to disconnect old DCOM client connections manually? How?

I'm not sure about the DCOM ping system, but one option for you would be to simply farm off the notifications to a separate thread pool. This will help mitigate the effect of having a small number of blocking clients - you'll start having problems when there are too many though, of course.

The easy way to do this is to use QueueUserWorkItem - this will invoke the passed callback on the process's system thread pool. Assuming you're using an MTA, this is all you need to do:

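struct InfoStruct {
    IRemoteHost *pRemote;   // AddRef'd by InvokeClient, released by the worker
    BSTR someData;          // private copy owned by the work item
};

static DWORD WINAPI InvokeClientAsync(LPVOID lpInfo) {
  // Join the MTA on this pool thread (error checking elided).
  CoInitializeEx(NULL, COINIT_MULTITHREADED);

  InfoStruct *is = (InfoStruct *)lpInfo;
  is->pRemote->notify(is->someData);  // may block if the client is dead
  is->pRemote->Release();
  SysFreeString(is->someData);
  delete is;

  CoUninitialize();
  return 0;
}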

void InvokeClient(IRemoteHost *pRemote, BSTR someData) {

  InfoStruct *is = new InfoStruct;
  is->pRemote = pRemote;
  pRemote->AddRef();

  is->someData = SysAllocString(someData);
  QueueUserWorkItem(InvokeClientAsync, (LPVOID)is, WT_EXECUTELONGFUNCTION);
}

If your main thread is in an STA, this is only slightly more complex; you just have to use CoMarshalInterThreadInterfaceInStream and CoGetInterfaceAndReleaseStream to pass the interface pointer between apartments:

struct InfoStruct {
    IStream *pMarshalledRemote;  // interface marshalled by the calling thread
    BSTR someData;               // private copy owned by the work item
};

static DWORD WINAPI InvokeClientAsync(LPVOID lpInfo) {
  CoInitializeEx(NULL, COINIT_MULTITHREADED); // can be STA as well

  InfoStruct *is = (InfoStruct *)lpInfo;
  IRemoteHost *pRemote;
  // Unmarshal a proxy usable in this apartment; this also releases the stream.
  CoGetInterfaceAndReleaseStream(is->pMarshalledRemote, __uuidof(IRemoteHost), (LPVOID *)&pRemote);

  pRemote->notify(is->someData);
  pRemote->Release();
  SysFreeString(is->someData);
  delete is;

  CoUninitialize();

  return 0;
}

void InvokeClient(IRemoteHost *pRemote, BSTR someData) {
  InfoStruct *is = new InfoStruct;
  CoMarshalInterThreadInterfaceInStream(__uuidof(IRemoteHost), pRemote, &is->pMarshalledRemote);

  is->someData = SysAllocString(someData);
  QueueUserWorkItem(InvokeClientAsync, (LPVOID)is, WT_EXECUTELONGFUNCTION);
}

Note that error checking has been elided for clarity - you will of course want to error check all calls. In particular, you want to check for RPC_S_SERVER_UNAVAILABLE (which COM calls return wrapped as an HRESULT, i.e. HRESULT_FROM_WIN32(RPC_S_SERVER_UNAVAILABLE)) and other such network errors, and remove the offending clients.
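As a rough illustration of that check, here is a minimal sketch of classifying a failed call's result as "client is gone". The function name IsDeadClientError and the choice of which codes count as fatal are assumptions for illustration. The numeric values are the ones from winerror.h, written out only so the sketch compiles off Windows; in the real server you would compare against the header's names instead.

```cpp
#include <cstdint>

// Stand-in for HRESULT so the sketch builds without <windows.h> (assumption).
using HResult = std::uint32_t;

// Values as defined in winerror.h.
constexpr HResult kRpcServerUnavailable = 0x800706BA; // HRESULT_FROM_WIN32(RPC_S_SERVER_UNAVAILABLE)
constexpr HResult kRpcCallFailed        = 0x800706BE; // HRESULT_FROM_WIN32(RPC_S_CALL_FAILED)
constexpr HResult kRpcDisconnected      = 0x80010108; // RPC_E_DISCONNECTED

// Hypothetical policy: treat these transport-level failures as "drop this client".
// Other failures (e.g. E_FAIL from the client's own code) are left to the caller.
bool IsDeadClientError(HResult hr) {
    switch (hr) {
    case kRpcServerUnavailable:
    case kRpcCallFailed:
    case kRpcDisconnected:
        return true;
    default:
        return false;
    }
}
```

The worker would call this on the result of notify() and, when it returns true, remove the client from the server's list rather than retrying.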

Some more sophisticated variations you may want to consider include ensuring only one request is in-flight per client at a time (thus further reducing the impact of a stuck client) and caching the marshalled interface pointer in the MTA (if your main thread is an STA). Since I believe CoMarshalInterThreadInterfaceInStream may perform network requests, you'd ideally want to take care of it ahead of time when you know the client is connected, rather than risking blocking on your main thread.
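The "one request in flight per client" idea could be sketched roughly like this. InFlightGate and NotifyIfIdle are hypothetical names, and the sketch assumes each client record on the server owns one gate; what to do with an event that is skipped (drop it, queue it, coalesce it) is left open.

```cpp
#include <atomic>

// One gate per client: at most one notification may be outstanding.
class InFlightGate {
public:
    // Returns true if the caller acquired the gate, false if a call is already running.
    bool TryEnter() {
        bool expected = false;
        return busy_.compare_exchange_strong(expected, true);
    }
    void Leave() { busy_.store(false); }

private:
    std::atomic<bool> busy_{false};
};

// Runs `call` only if no other notification to this client is in flight.
// Returns false (without calling) when the client is still busy or stuck.
template <typename Fn>
bool NotifyIfIdle(InFlightGate& gate, Fn&& call) {
    if (!gate.TryEnter()) return false;
    // Release the gate even if `call` throws.
    struct Release {
        InFlightGate& g;
        ~Release() { g.Leave(); }
    } release{gate};
    call();
    return true;
}
```

With this in place a dead client ties up at most one pool thread, because every later notification for it returns immediately until the stuck call fails.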

One solution would be to eliminate events - make clients query the server for whether there's anything of interest.
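A polling design along these lines might look like the sketch below. EventLog, Publish and FetchSince are hypothetical names, and the string payload and fixed capacity are assumptions; the real server would expose something like FetchSince through its existing COM interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Bounded, sequence-numbered event log that clients poll.
class EventLog {
public:
    using Event = std::pair<std::uint64_t, std::string>; // (sequence number, payload)

    explicit EventLog(std::size_t capacity) : capacity_(capacity) {}

    // Server side: record an event and return its sequence number.
    std::uint64_t Publish(std::string payload) {
        std::lock_guard<std::mutex> lock(mutex_);
        events_.emplace_back(++lastSeq_, std::move(payload));
        if (events_.size() > capacity_) events_.pop_front(); // forget the oldest
        return lastSeq_;
    }

    // Client-facing: all retained events newer than the last one the client saw.
    std::vector<Event> FetchSince(std::uint64_t seq) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<Event> out;
        for (const auto& e : events_) {
            if (e.first > seq) out.push_back(e);
        }
        return out;
    }

private:
    mutable std::mutex mutex_;
    std::deque<Event> events_;
    std::uint64_t lastSeq_ = 0;
    std::size_t capacity_;
};
```

Because the server never calls out to clients, a crashed client simply stops polling and cannot block anything; the cost is latency up to one polling interval.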

Use DCOM only to set up a named pipe for notifications. Pipes handle disconnection better, and a listener responds (almost) instantly to messages. For example: the server asks the client for its pipe name; the client replies with a name that includes its machine; the client creates the named pipe and listens; the server opens the pipe either immediately or when needed.

You can implement your own ping mechanism so your clients will call server's ping method from time to time. You already maintain some sort of container for your clients on the server side. In that map mark each client with a timestamp of last ping. Then check if the client is alive before sending events to that client. You can customize a strategy of when to stop sending events, maybe based on time or number of missed pings or type of event or some other factors. You probably don't need to worry about deleting clients - that can wait till DCOM realizes that a particular client is dead. This scheme may not eliminate the issue completely since a client may die just before an event needs to be sent, but you will have complete control over how many such clients may exist by tweaking the ping period. The smaller this period the fewer dead clients although you pay with traffic.
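The timestamp bookkeeping described above could be sketched as follows. ClientRegistry, Touch, IsAlive and LiveClients are hypothetical names, and identifying clients by a string id is an assumption; the actual server would key on whatever it already uses in its client container.

```cpp
#include <chrono>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Tracks the last ping time per client and answers "should I still send to it?".
class ClientRegistry {
public:
    using Clock = std::chrono::steady_clock;

    explicit ClientRegistry(Clock::duration timeout) : timeout_(timeout) {}

    // Call from the server's ping method (and when a client first registers).
    void Touch(const std::string& clientId, Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> lock(mutex_);
        lastPing_[clientId] = now;
    }

    // True if the client is known and pinged within the timeout window.
    bool IsAlive(const std::string& clientId, Clock::time_point now = Clock::now()) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = lastPing_.find(clientId);
        return it != lastPing_.end() && (now - it->second) <= timeout_;
    }

    // Ids of the clients that should still receive events.
    std::vector<std::string> LiveClients(Clock::time_point now = Clock::now()) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> live;
        for (const auto& entry : lastPing_) {
            if ((now - entry.second) <= timeout_) live.push_back(entry.first);
        }
        return live;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, Clock::time_point> lastPing_;
    Clock::duration timeout_;
};
```

Before broadcasting an event, the server would iterate over LiveClients() instead of the full client list. The timeout should be a small multiple of the clients' ping period so that a single delayed ping does not cut a healthy client off.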