How to search for a part of a string in an array?

Developer https://www.devze.com 2023-03-21 21:04 Source: Internet
I want to search whether the complete string or a part of the string is a part of the array. How can this be achieved in PHP?

Also, how can I use metaphone in it as well?

Example:

array1={'India','USA','China'};
array2={'India is in east','United States of America is USA','Made in China'}

If I search for array1 in array2, then:

'India' should match 'India is in east' and similarly for USA & China.


$array1 = array('India','USA','China');
$array2 = array('India is in east','United States of America is USA','Made in China');
$found = array();

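foreach ($array1 as $key => $value) {
    // Escape the term so regex metacharacters in it are matched literally.
    $pattern = '/' . preg_quote($value, '/') . '/';
    // Thanks to @Andrea for this suggestion:
    $found[$value] = preg_grep($pattern, $array2);
    // Alternative (merges all matches into one flat array):
    //$found = $found + preg_grep($pattern, $array2);
}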

print_r($found);

Result (using the alternative line, which merges the matches into one flat array):

Array
(
    [0] => India is in east
    [1] => United States of America is USA
    [2] => Made in China
)

Using Metaphone is trickier. You will have to determine what constitutes a match. One way to do that is to use the Levenshtein distance between the Metaphone results for the two values being compared.

Update: See @Andrea's solution for a more sensible per-word Metaphone comparison.

Here's a rough example:

// create_function() was removed in PHP 8; use an anonymous function instead.
$toMeta = function ($v) {
    return array(metaphone($v) => $v);
};

$meta1 = array_map($toMeta, $array1);
$meta2 = array_map($toMeta, $array2);

$threshold = 3;

foreach ($meta2 as $key2 => $value2) {

    $k2 = key($value2);
    $v2 = $value2[$k2];

    foreach ($meta1 as $key1 => $value1) {

        $k1  = key($value1);
        $v1  = $value1[$k1];
        $lev = levenshtein($k2, $k1);

        if( strpos($v2, $v1) !== false || $lev <= $threshold ) {
            array_push( $found, $v2 );
        }
    }
}

...but it needs work: it produces duplicates if the threshold is too high. You may prefer to run the match in two passes: one to find simple matches, as in my first code example, and another to match with Metaphone only if the first returns no matches.
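
The two-pass idea could be sketched roughly as follows. This is only an illustration: the function name twoPassSearch is made up here, the second pass compares Metaphone keys word by word (closer to @Andrea's suggestion than to the whole-string comparison above), and the default threshold of 0 is an assumption you would need to tune.

```php
<?php
// Hypothetical helper: pass 1 is a plain substring test; pass 2 runs only
// if pass 1 found nothing, and compares Metaphone keys word by word.
function twoPassSearch(string $needle, array $haystack, int $threshold = 0): array
{
    // Pass 1: simple case-insensitive substring match (keys are preserved).
    $exact = array_filter($haystack, function ($line) use ($needle) {
        return stripos($line, $needle) !== false;
    });
    if ($exact) {
        return $exact;
    }

    // Pass 2: Levenshtein distance between Metaphone keys, one word at a time.
    // With $threshold = 0 this is plain key equality; larger values start to
    // admit short, unrelated words.
    $needleKey = metaphone($needle);
    return array_filter($haystack, function ($line) use ($needleKey, $threshold) {
        foreach (preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY) as $word) {
            if (levenshtein(metaphone($word), $needleKey) <= $threshold) {
                return true;
            }
        }
        return false;
    });
}

$array2 = array('India is in east', 'United States of America is USA', 'Made in China');
print_r(twoPassSearch('China', $array2));  // found by pass 1
print_r(twoPassSearch('Indiuh', $array2)); // no substring hit, so pass 2 runs
```

Because the second pass only runs when the first is empty, a term never appears once as an exact hit and again as a fuzzy hit, which avoids the duplicates mentioned above.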


The Metaphone case could also follow the same structure proposed by Mike for the strict case.

I do not think that an additional similarity function is needed, because the purpose of Metaphone is to give us a key that is common to words that sound the same.

$array1 = array('India','USA','China');
$array2 = array(
    'Indiuh is in east',
    'United States of America is USA',
    'Gandhi was born in India',
    'Made in China'
);
$found = array();
foreach ($array1 as $key => $value) {
    // preg_quote() keeps any regex metacharacters in the term literal.
    $found[$value] = preg_grep('/\b'.preg_quote($value, '/').'\b/i', $array2);
}

var_export($found);

echo "\n\n";

function meta( $sentence )
{
    return implode(' ', array_map('metaphone', explode(' ', $sentence)));
}

$array2meta = array_map('meta', $array2);
foreach ($array1 as $key => $value) {
    $valuemeta = meta($value);
    $foundmeta[$value] = preg_grep('/\b'.$valuemeta.'\b/', $array2meta);
    $foundmeta[$value] = array_intersect_key($array2, $foundmeta[$value]);
}

var_export($foundmeta);

The above code prints out:

array (
  'India' => 
  array (
    2 => 'Gandhi was born in India',
  ),
  'USA' => 
  array (
    1 => 'United States of America is USA',
  ),
  'China' => 
  array (
    3 => 'Made in China',
  ),
)

array (
  'India' => 
  array (
    0 => 'Indiuh is in east',
    2 => 'Gandhi was born in India',
  ),
  'USA' => 
  array (
    1 => 'United States of America is USA',
  ),
  'China' => 
  array (
    3 => 'Made in China',
  ),
)
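
The claim that Metaphone yields a shared key for like-sounding words can be checked directly. A small sketch, using the spellings from the example above with 'China' added as a contrast case:

```php
<?php
// Spellings taken from the example above; 'China' is the contrast case.
$words = array('India', 'Indiuh', 'China');
$keys  = array_map('metaphone', $words);

// Pair each word with its phonetic key for inspection.
print_r(array_combine($words, $keys));

// Like-sounding spellings share a key, so plain equality is enough.
var_dump($keys[0] === $keys[1]); // India vs Indiuh
var_dump($keys[0] === $keys[2]); // India vs China
```

This is why the per-word version above can reuse preg_grep on the keys instead of needing a distance function.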


$a1 = array('India','USA','China');
$a2 = array('India is in east','United States of America is USA','Made in China');


foreach ( $a2 as $a )
{
  foreach( $a1 as $b  )
  {
    // strpos() returns false on a miss, so compare strictly against false.
    if ( strpos( $a, $b ) !== false )
    {
      echo $a . " contains " . $b . "\n";
    }
  }
}
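
If you need the matches as data rather than printed lines, the same nested loop can collect them. A minimal sketch: the function name findContaining is hypothetical, and stripos is swapped in for strpos on the assumption that case should not matter.

```php
<?php
// Hypothetical helper: returns [needle => [haystack lines containing it]].
function findContaining(array $needles, array $haystacks): array
{
    $result = array();
    foreach ($needles as $needle) {
        $result[$needle] = array();
        foreach ($haystacks as $line) {
            // stripos() returns false on a miss and 0 for a match at the
            // start of the line, so the comparison must be strict.
            if (stripos($line, $needle) !== false) {
                $result[$needle][] = $line;
            }
        }
    }
    return $result;
}

$a1 = array('India', 'USA', 'China');
$a2 = array('India is in east', 'United States of America is USA', 'Made in China');
$matches = findContaining($a1, $a2);
print_r($matches);
```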