I have a PHP web crawler that just checks out websites. A few days ago I decided to show the crawler's progress in real time using AJAX. The PHP script writes its progress to a small JSON file, and the AJAX script reads that file.
After finishing the simple AJAX script I double- and triple-checked my PHP script, wondering what the hell was going on, because the data appearing in my browser jumped up and down erratically.
The PHP script executed perfectly and very quickly, but my AJAX would slowly increase the values every 2 seconds, as set, and then drop. The numbers only ever increase in PHP; they never go down. However, the numbers showing up on my webpage go up and down, as if the buffer were working on multiple sessions or reading from something that is still being updated, even though the PHP script stopped about an hour ago.
Is there something I'm missing or need to keep clear like a buffer or a reset button?
This is the most I can show; I just slapped it together a really long time ago. If you know of better code, please share it. I'd love any help possible. I'm fairly new, though, so please explain anything beyond basic functions.
AJAX
//open our json file
ajaxRequest.onreadystatechange = function(){
    if(ajaxRequest.readyState == 4){
        //display json file contents
        document.form.total_emails.value = ajaxRequest.responseText;
    }
}
ajaxRequest.open("GET", "test_results.php", true);
ajaxRequest.send(null);
PHP
//get addresses and links
for($x=(int)0; $x<=$limit; $x++){
    $input = get_link_contents($link_list[0]);
    array_shift($link_list);
    $link_list = ($x%100==0 || $x==5)?filter_urls($link_list,$blacklist):$link_list;
    //add the links to the link list and remove duplicates
    if(count($link_list) <= 1000) {
        preg_match_all($link_reg, $input, $new_links);
        $link_list = array_merge($link_list, $new_links);
        $link_list = array_unique(array_flatten($link_list));
    }
    //check the addresses against the blacklist before adding to a file in JSON
    $res = preg_match_all($regex, $input, $matches);
    if ($res) {
        foreach(array_unique($matches[0]) as $address) {
            if(!strpos_arr($address,$blacklist)){
                $enum++;
                json_file($results_file,$link_list[0],$enum,$x);
                write_addresses_to_file($address, $address_file);
            }
        }
    }
    unset($input, $res, $efile);
}
The symptoms suggest that the PHP script is not closing the file properly after writing, and/or that there is a race condition in which the AJAX request fetches the JSON data between PHP's fopen() and the new data being written.
A possible solution is for the PHP script to write to a temp file, then rename it to the desired filename once the data has been written and the file properly closed. That way the reader sees either the old file or the new one, never a half-written one.
Also, it's a good idea to check ajaxRequest.status == 200 as well as ajaxRequest.readyState == 4.
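The status check could look roughly like the sketch below. It is only an illustration: `readCompletedResponse` and `pollProgress` are hypothetical helper names, not part of the original script, and it assumes that test_results.php returns the JSON file's contents. Parsing the body inside a try/catch also means a half-written file is ignored rather than displayed.

```javascript
// Hypothetical helper: returns the parsed JSON body only when the request
// has finished (readyState 4) and succeeded (status 200); otherwise null.
function readCompletedResponse(xhr) {
  if (xhr.readyState !== 4) return null;  // request still in flight
  if (xhr.status !== 200) return null;    // not a successful response
  try {
    return JSON.parse(xhr.responseText);
  } catch (e) {
    return null;                          // truncated or malformed JSON
  }
}

// Hypothetical helper: polls the URL once and passes valid data to onData.
// Call it from a timer to get the 2-second refresh.
function pollProgress(url, onData) {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function () {
    var data = readCompletedResponse(xhr);
    if (data !== null) onData(data);
  };
  xhr.open("GET", url, true);
  xhr.send(null);
}
```

To use it, you would call something like `pollProgress("test_results.php", callback)` from `setInterval` with a 2000 ms delay, and have the callback copy whichever field you need into `document.form.total_emails.value`. The field names depend on what `json_file()` writes, which isn't shown in the question.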
Tools like ngrep and tcpdump can help debug this type of problem.