I have a PHP web app which scrapes a search engine for a given keyword.
Currently the app is looping through an array of keywords running the scrape function one keyword at a time.
This is OK at the moment because the number of keywords is fairly small, but it won't scale well.
I think the best way to go will be to select a smaller set of keywords from the MySQL db using LIMIT, and then run the scrape function concurrently against the entire array. Once that set has finished, I'll move on to the next set.
But I'm stuck with how to run the function concurrently against the array.
How would you handle this?
PHP itself has no built-in concurrency, but if you fetch your search results with cURL, the cURL extension's multi-request interface (the `curl_multi_*` functions) lets you at least parallelize the fetching of the results.
<?php
$fruits = array("d" => "lemon", "a" => "orange", "b" => "banana", "c" => "apple");

// Modifies each element in place (note the reference parameter);
// the third argument to array_walk() is passed through as $prefix.
function test_alter(&$item1, $key, $prefix)
{
    $item1 = "$prefix: $item1";
}

// Prints each key/value pair.
function test_print($item2, $key)
{
    echo "$key. $item2<br />\n";
}

echo "Before ...:\n";
array_walk($fruits, 'test_print');
array_walk($fruits, 'test_alter', 'fruit');
echo "... and after:\n";
array_walk($fruits, 'test_print');
?>
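To make the cURL suggestion concrete, here is a minimal sketch of fetching one batch of keywords in parallel with the `curl_multi_*` API. The `fetch_serps()` name and the search URL are illustrative assumptions, not from the question; point the URL at whatever engine you are scraping.

```php
<?php
// Sketch: fetch several search-result pages in parallel with curl_multi.
// fetch_serps() and the example.com URL are hypothetical placeholders.
function fetch_serps(array $keywords)
{
    $mh = curl_multi_init();
    $handles = array();

    foreach ($keywords as $kw) {
        $ch = curl_init('https://example.com/search?q=' . urlencode($kw));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body instead of echoing it
        curl_multi_add_handle($mh, $ch);
        $handles[$kw] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh); // wait for socket activity instead of busy-looping
        }
    } while ($running > 0);

    // Collect each response body, keyed by keyword, and clean up.
    $results = array();
    foreach ($handles as $kw => $ch) {
        $results[$kw] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
?>
```

You would call `fetch_serps()` once per LIMIT-sized batch pulled from MySQL, so each batch's requests run concurrently while the batches themselves proceed one after another.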