I currently have an IRC bot written in C++ which monitors a page written in PHP for changes and then outputs these changes to the IRC channel. However, the current method is rather ineffective, as it just polls the page once every 10 seconds and compares it to the last seen version to check whether anything has changed. I can decrease the check interval to about 2-3 seconds before the IRC bot starts to take a performance hit, but this isn't ideal. The page I am monitoring can often change multiple times within the 10-second period, so a change could be missed. What would be a better method to get the data from the page, considering that I control both the PHP page and the IRC bot, but they are on different servers?
The sole purpose of this page is to pass data to the IRC bot, so it could be completely re-implemented as something else if that would be a better solution; the IRC bot also monitors multiple versions of this page to check for different things.
If the data generated by PHP isn't somehow pushed on a stream (broadcast or feed), you unfortunately have no choice but to poll the page.
What you could do is push the data from PHP using a broadcast, make a persistent connection from the bot to the PHP script, or have the PHP script calculate the differences itself.
The PHP script should send a message to a public port or path that your IRC bot listens on, containing information about any posts made. This way, you are notified only when a message arrives.
One note about this approach: beware of bursts of posts within a short period. If ordering and reliable delivery are important, you'll want to implement this using a proper MQ service such as 0MQ or RabbitMQ (or whichever MQ framework you prefer), so that messages arrive in order and both sending and receiving are guaranteed.
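Short of a full MQ, one lightweight way for the bot to at least notice lost or reordered messages is to have the PHP side number them. The following is only a sketch under that assumption: the numbering scheme (1, 2, 3, ...) and the names SequenceTracker and Delivery are made up for illustration and are not part of any MQ library.

```cpp
#include <cassert>
#include <cstdint>

// What the bot learns from the sequence number on an incoming message.
enum class Delivery { InOrder, Gap, Stale };

// Tracks the highest sequence number seen so far. Assumes the PHP side
// numbers its messages 1, 2, 3, ... (an assumption of this sketch).
class SequenceTracker {
public:
    Delivery observe(std::uint64_t seq) {
        assert(seq != 0 && "sequence numbers are assumed to start at 1");
        if (seq <= last_) {
            return Delivery::Stale;  // duplicate or late arrival
        }
        const Delivery result =
            (seq == last_ + 1) ? Delivery::InOrder : Delivery::Gap;
        last_ = seq;
        return result;
    }

    // Highest sequence number observed so far (0 if none yet).
    std::uint64_t last() const { return last_; }

private:
    std::uint64_t last_ = 0;
};
```

On a Gap the bot knows it missed something and could fall back to fetching the full page once. This only detects problems; it does not give the delivery guarantees of a real queue.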
If you need to monitor every change, then have your PHP page "push" data to your bot rather than having your IRC bot "pull" data from the page (through polling). This can be done over any network socket, even something like an HTTP POST request from your PHP page to your bot over port 80.
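As a rough idea of what the receiving end could look like on the bot, here is a minimal sketch using plain POSIX TCP sockets rather than HTTP. It assumes the sender opens a connection, writes one message and closes it; read_all and listen_for_pushes are hypothetical names, and the port number is whatever you choose.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <functional>
#include <string>

// Reads everything the peer sends until it closes its end of the connection.
std::string read_all(int fd) {
    std::string data;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        data.append(buf, static_cast<size_t>(n));
    }
    return data;
}

// Accepts up to `max_messages` connections on `port` and hands each pushed
// message to `on_message`. Returns false if the socket could not be set up.
bool listen_for_pushes(unsigned short port, int max_messages,
                       const std::function<void(const std::string&)>& on_message) {
    assert(on_message);
    int server = socket(AF_INET, SOCK_STREAM, 0);
    if (server < 0) return false;

    int yes = 1;  // allow quick restarts of the bot on the same port
    setsockopt(server, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(server, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0 ||
        listen(server, 16) < 0) {
        close(server);
        return false;
    }

    for (int i = 0; i < max_messages; ++i) {
        int client = accept(server, nullptr, nullptr);  // blocks until a push arrives
        if (client < 0) continue;
        on_message(read_all(client));
        close(client);
    }
    close(server);
    return true;
}
```

On the PHP side, something like fsockopen() followed by fwrite() and fclose() would be enough to deliver a message to this listener. Because accept() blocks, in a real bot this loop would probably run in its own thread or be folded into the bot's existing select/poll loop, and you would want some check that the connection really comes from your PHP server.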
A good alternative to polling is Comet. Here are examples (for JavaScript though): http://www.zeitoun.net/articles/comet_and_php/start.
I would suggest this approach:
when you retrieve your page, specify a very long timeout, say 10 minutes (bear with me for a moment);
if you have a new page, let the server return it; otherwise just don't send a reply;
if there is no new page, the client will wait for up to 10 minutes before giving up (timing out); but if a new page appears during this time, your server can reply to the request and pass the page to the client;
in case the timeout fires, you simply send another request with the same long timeout.
Hope I could explain it clearly. The only tricky point is how your web page (PHP) can hold the wait when a request arrives if there is no new data to send back. This can be easily accomplished like this:
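    // In the script that produces the data ($dataFile is an illustrative path):
    if ($newDataAvailable) {
        file_put_contents($dataFile, $data);   // filename first, then contents
    }
    // In the script the bot requests; $since is the file time the bot last saw:
    $since = (int) ($_GET['since'] ?? 0);
    do {
        sleep(1);                              // wait instead of spinning
        clearstatcache();                      // filemtime() results are cached
        $newDataAvailable = is_file($dataFile) && filemtime($dataFile) > $since;
    } while (!$newDataAvailable);
    //-- here data is available
    header('X-Data-Time: ' . filemtime($dataFile));  // made-up header name
    echo file_get_contents($dataFile);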
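On the bot side, the request loop described in the steps above could be sketched roughly as follows. The actual HTTP call is left as a hook you supply (for example a wrapper around whatever HTTP client the bot already uses); Fetch, OnPage and long_poll are hypothetical names, and max_requests exists only so the loop can terminate in a test.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>

// Hypothetical transport hook: performs one HTTP GET that may block for up to
// `timeout`. Returns true and fills `body` if the server replied, or false if
// the request timed out.
using Fetch = std::function<bool(std::chrono::seconds timeout, std::string& body)>;
using OnPage = std::function<void(const std::string& body)>;

// Sends up to `max_requests` long-poll requests. Each reply is handed to
// `on_page`; a timeout just leads to the next request. Returns the number of
// replies received.
int long_poll(const Fetch& fetch, const OnPage& on_page, int max_requests,
              std::chrono::seconds timeout = std::chrono::minutes(10)) {
    assert(fetch && on_page);
    int replies = 0;
    for (int i = 0; i < max_requests; ++i) {
        std::string body;
        if (!fetch(timeout, body)) {
            continue;  // timed out: send the same request again
        }
        on_page(body);
        ++replies;
    }
    return replies;
}
```

The bot does no work while a request is pending, so it hears about a change roughly as soon as the server answers instead of up to 10 seconds later.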