I've got a project where开发者_开发问答 I'm hitting a bunch of custom Windows Performance Counters on multiple servers and aggregating them into a database. If a server is down, I want to skip it, and just continue on with my day.
Currently I'm checking to see if a server is live by doing a DirectoryInfo
on a share that I've got to look at later in the process anyways, then checking the .Exists
property.This is my current code snippet for testing:
DirectoryInfo di = new DirectoryInfo(machine.Share_Path);
if (!di.Exists)
{
log.Warn("Could not access " + machine.Name + "! Maybe its down?");
continue; // Skips to the next server in my loop where this snippet exists.
}
This works, but its pretty slow. It takes about 68 seconds on average for the di.Exists bit to finish its work, and I ideally need to know within a second whether or not a server is accessible. Pinging also isn't an option since a server can be pingable but not "live" in our environment.
I'm still kind of fresh to the .NET world, so I'm open to any advice people can offer.
Thanks in advance.
-Weegee
Ping First, Ask Questions Later
Why not ping first, and then do the di.Exists if you get a response?
That would allow you to fail early in the case that is not reachable, and not waste the time for machines that are down hard.
I have, in fact, used this method successfully before.
- MSDN Ping Documentation
Paralellize
Another option you have is to paralellize the checking, and action on the servers as they are known to be available.
You could use the Paralell.ForEach()
method, and use a thread-safe queue along with a simple consumer thread to do the required action. Combined with the checking method above, this could alleviate almost all of your bottleneck on the up/down checking.
Knock on the Door
Yet another method would be to ckeck if the required remote service is running (either by hitting its port directly or by querying it with WMI).
Since WMI is almost always running when a machine is up, your connection should be very quick to either succeed or fail.
The only "quick" way I think to see if it's up without relying on ping would be to create a socket, and see if you can actually connect to the port of the service you're trying to reach.
This would be the equivalent of telnet servername 135 to see if it's up.
Specifically...
- Create a .NET TCP socket client (
System.Net.Sockets.TcpClient
) - Call BeginConnect() as an asynchronous operation, to connect to the server in question on one of the RPC ports that your directory exists code would use anyway (TCP 135, 139, or 445).
- If you don't hear back from it within X milliseconds, call
Close()
to cancel the connection.
Disclaimer: I have no idea what effect this would have on any threat/firewall protection that may see this type of Connect / Disconnect with no data sent activity as a threat.
Opening Socket to a specific port usually does the trick. If you really want it to be fast, be sure to set the NoDelay property on the new socket (Nagle algorithm) so there is no buffering.
Fast will largely depend on latency but this is probably the fastest way I know to connect to an endpoint. It's pretty simple to parallelize using the async methods. How fast you can check will largely depend on your network topology but in tests for 1000 servers (latency between 0-75ms) I've been able to get connectivity state in ~30 seconds. Not scientific data at all but should give you the idea.
Also, don't ever do this through UNC file shares because if the server no longer exists you will have a lot of dangling connections that take forever to timeout. So if you have a lot of servers with invalid DNS records and you try to poll them you will bring Windows down completely over time. Things like File.Exists and any file access will cause this.
The "Full-Blown" option would be to install a monitoring tool like SCOM (System Center Operations Manager), this has an SDK you can use to query SCOM for (performance) and maintenance information avout machines being monitored. Might be a bridge to far though....
Telnet is another option. Try telnetting to the target machine to see if it responds.
Create a small Windows Service that you install on your target machine, have the sys admin stop it when they perform maintenance on the target machine (just use batch file to net stop / net start the service)
精彩评论