开发者

Efficient algorithm for detecting matches

开发者 https://www.devze.com 2022-12-17 12:59 出处:网络
I\'m looking for an efficient algorithm f开发者_开发百科or detecting equal values in an array of integers N size. It must return the indices of the matches.

I'm looking for an efficient algorithm f开发者_开发百科or detecting equal values in an array of integers N size. It must return the indices of the matches.

Alas, I can't think of anything more clever then brute force with two loops.

Any help will be appreciated. Thanks!


You could intersect the array. This finds all the values of array2 that are in array1

$array1 = array("a" => "green", "b" => "brown", "c" => "blue", "red");
$array2 = array("a" => "green", "yellow", "red");
$result_array = array_intersect_assoc($array1, $array2);
print_r($result_array);

Would return

Array
(
    [a] => green
)

It returns an array with all of the keys and values of the matches. Basically you can provide an infinite number of arguments to the array_insert_assoc:

array_intersect_assoc($base_array, $arr1, $arr2 ...);

It will search $base_array for the values that are in all the subsequent arrays. That means that the key and value will be taken from the $base_array

You could also compare the keys by using:

array_intersect_keys($base_array, $arr1, $arr2, $arr3);


These loops are O(N^2). Is N big? If so, can you sort the array O(NlogN), then scan it O(N)? ... or am I missing something?


You can use a set to hold the recent values. For example,

results = empty list
set = empty set
foreach key, val in array:
   if val is not in set: add val to set
   else: add key to results
return results

Each look up of set is O(1), so this algo will results in O(n) instead of O(n^2) if nested-loop is used.

In case you want to keep track of multi-occurence like this array 1, 2, 3, 3, 2, 1 you can use a hash table with key is the value and value (of the corresponding key in table) is the list of indices. The result for the given array will look lik {1:0, 5; 2: 1, 4; 3: 2, 3}.

results = empty hashtable
for each key, val in array:
    if val is not in results:
        results[val] = new list()
    results[val].append(key)
return results


Perhaps this?

$arr = array_map('unserialize', array_unique(array_map('serialize', $arr)));

From the question: How to remove duplicated 2-dimension array in PHP?

if ($arr !== array_map('unserialize', array_unique(array_map('serialize', $arr))))
{
    // found duplicates
}


You don't have to go through all the array again for each element. Only test an element with the subsequent element in the array:

$array = /* huge array */;
$size = count($array);
for($i = 0; $i < $size; $i++)
{
    for($j = $i + 1; $j < $size; $j++) // only test with the elements after $i
    {
        if($array[$i] == $array[$j])
            return true; // found a duplicate
    }
    return false; // found no duplicate
}

That's the most efficient way I can think of. Adapt it to your need as you will.


If one of your arrays is reasonably static (that is you are comparing to the same array several times ) you could invert it.

That is set up another array which is keyed by value and returns the index into the real array.

$invert = array();
foreach ($cmptoarray as $ix => $ival) {
   $invert[$ival] = $ix;
}

Then you simply need an if ( isset($invert[$compfrmarray[$i]) ) .... to check the number.

Note: this is only worth doing if you compare against the same array several times!


Just use an associative array mapping a value to its index:

foreach($array1 as $index => $value) {
    $aa[$value] = $index;
}

foreach($array2 as $index => $value) {
    if(isset($aa[$value])) {
        echo 'Duplicate:  .  Index 1:  '.$aa[$value].'  Index 2:  '.$index.'.';
    }
}
0

精彩评论

暂无评论...
验证码 换一张
取 消