I am trying to build a system for monitoring site/server uptime in PHP; it will need to check thousands of domains/IPs a minute. I have looked into cURL, as this seems to be the best method.
Edit: The system will be required to probe a server, check that its response time is reasonable, and return its response code. It will then add a row to a MySQL database containing the response time and status code. The notification part of the system is fairly straightforward from there. The system will be on dedicated servers. Hope this adds some clarity.
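For context, recording one check result might look roughly like this (the DSN, table and column names are just for illustration, and $url / $statusCode / $responseTime are assumed to come from whatever probing code ends up being used):

// Sketch only - connection details, table and column names are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=monitoring', 'user', 'pass');

$stmt = $pdo->prepare(
    'INSERT INTO checks (url, status_code, response_time, checked_at)
     VALUES (?, ?, ?, NOW())'
);
$stmt->execute(array($url, $statusCode, $responseTime));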
Why not go for the KISS approach and use php's get_headers() function?
If you want to retrieve the status code, here's a snippet from the comments to the php manual page:
function get_http_response_code($theURL) {
    $headers = get_headers($theURL);
    // $headers[0] is the status line, e.g. "HTTP/1.1 200 OK" - grab the "200"
    return substr($headers[0], 9, 3);
}
This function (being a core PHP feature) should be faster and more efficient than cURL.
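A quick usage sketch (the URL and 5-second timeout are placeholders; note that by default get_headers() issues a full GET request, so for a pure uptime check you may want to switch the default stream context to HEAD, as shown, to avoid downloading the body):

// Sketch only - example.com and the timeout value are placeholders.
stream_context_set_default(array(
    'http' => array(
        'method'  => 'HEAD',  // fetch only the headers, not the body
        'timeout' => 5,       // seconds
    ),
));

$start  = microtime(true);
$status = get_http_response_code('http://www.example.com');
$time   = microtime(true) - $start;   // rough response time in seconds

if ($status !== '200') {
    // treat as down (or at least worth logging) and record $status / $time
}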
If I understand correctly, this system of yours will constantly be connecting to thousands of domains/IPs, and if the connection succeeds, it assumes the server is up and running?
I suppose you could use cURL, but it would take a long time especially if you're talking thousands of requests - you'd need multiple servers and lots of bandwidth for this to work properly.
You can also take a look at multi cURL for parallel requests (i.e. sending out 10+ cURL requests simultaneously, instead of one at a time).
http://php.net/manual/en/function.curl-multi-exec.php
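For illustration, a rough curl_multi sketch (the URLs, the HEAD-style request and the 5-second timeout are assumptions, not part of the original answer):

// Check a batch of hosts in parallel with curl_multi.
$urls = array('http://example.com', 'http://example.org'); // placeholder hosts

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD-style request, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo any output
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);           // assumed per-request timeout
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run all handles until every transfer has finished.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

foreach ($handles as $url => $ch) {
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);   // 0 means no response at all
    $time = curl_getinfo($ch, CURLINFO_TOTAL_TIME);  // response time in seconds
    // INSERT $url, $code, $time into the MySQL table here.
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);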
There are very good tools for things like that. No need to write it on your own.
Have a look at Nagios for example. Many admins use it for monitoring.
Your bottleneck will be waiting for a given host to respond. With a 30-second timeout and N hosts to check, if every host but the last fails to respond, you'll wait 30(N-1) seconds before you even start checking the last host - with, say, 1,000 hosts checked sequentially that's over 8 hours, so you may never get to it.
You certainly need to send multiple HTTP requests concurrently - either with multi cURL, as already suggested, or with the HttpRequestPool class (from the pecl_http extension) for an OO approach.
You will also need to consider how to break the set of N hosts down into subsets of a limited size, to avoid failing to reach a host simply because a queue of non-responding hosts has to be dealt with first (a rough batching sketch follows at the end of this answer).
Checking N hosts from 1 server presents the greatest chance of not reaching one or more hosts due to a queue of non-responding hosts. This is the cheapest, easiest and least reliable option.
Checking 1 host each from N servers presents the least chance of not reaching one or more hosts due to a queue of non-responding hosts. This is the most expensive, (possibly) most difficult and most reliable option.
Consider a cost/difficulty/reliability balance that works best for you.
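As a rough illustration of the batching idea (the batch size of 50 and the check_batch() helper are hypothetical; check_batch() stands in for a parallel checker such as the curl_multi sketch above):

// Hypothetical batching sketch - 50 is an assumed batch size.
$hosts = array(/* ... thousands of URLs ... */);

foreach (array_chunk($hosts, 50) as $batch) {
    check_batch($batch);   // each batch runs its checks concurrently
}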