
Remove almost duplicate values from an array in PHP [closed]

Developer https://www.devze.com 2023-03-28 01:43 Source: Internet
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 11 years ago.

Need help!

I have an array whose values are duplicated, but not exactly:

somestring = 'abcd-abcd-123', someOTHERstring223 = 'abcsd--adsf_12ds'

Array
(
    [0] => somestring
    [1] => somestring-(don't know the delimiter)core
    [2] => somestring_(don't know the delimiter)-(don't know the delimiter)somethingelse
    [3] => someOTHERstring223
    [4] => someOTHERstring223_junkstring
    [5] => someOTHERstring223OTHERSTRING-somethingNEW
)

and the result I want would be:

somestring
someOTHERstring223

I just want the shortest values, because somestring, somestring-(don't know the delimiter)core and somestring_(don't know the delimiter)-(don't know the delimiter)somethingelse count as the same: they all start with somestring.

Sorry everybody, I didn't ask the correct question.

I came up with an answer, but I don't know if it is the most efficient:

$coLL = array(
    'somestring',
    "somestring-(don't know the delimiter)core",
    "somestring_(don't know the delimiter)-(don't know the delimiter)somethingelse",
    'someOTHERstring223',
    'someOTHERstring223_junkstring',
    'someOTHERstring223OTHERSTRING-somethingNEW',
);
$coLL2 = $coLL;
foreach ($coLL as $coLLK => $coLLV) {
    $flength = strlen($coLLV);
    foreach ($coLL2 as $coLL2K => $coLL2V) {
        // Note: strcmp() only checks byte ordering, not whether $coLL2V starts with $coLLV.
        if (strcmp($coLLV, $coLL2V) < 0) {
            // Only drop values that are more than 3 characters longer.
            if (strlen($coLL2V) - $flength > 3) {
                unset($coLL2[$coLL2K]);
            }
        }
    }
}

I set the limiter if(strlen($coLL2V)-$flength > 3) because somestring1, somestring12 or somestring123 might come up; they are unique and should not match somestring.
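
One way to read the limiter described above is as a prefix test plus a minimum length difference. The following is only a sketch under that reading: isLongerVariant and keepShortest are hypothetical names, not functions from the question, and the default of 4 extra characters mirrors the "> 3" check. Unlike the strcmp() version, it uses strncmp() to test explicitly that one value starts with the other.

```php
<?php
// Hypothetical helper: true when $candidate starts with $base and is at
// least $minExtra characters longer (the question's "> 3" limiter).
function isLongerVariant($base, $candidate, $minExtra = 4)
{
    return strncmp($candidate, $base, strlen($base)) === 0
        && strlen($candidate) - strlen($base) >= $minExtra;
}

// Hypothetical helper: keep only values that are not a longer variant
// of some other value in the same list.
function keepShortest(array $values, $minExtra = 4)
{
    $kept = array();
    foreach ($values as $candidate) {
        $isVariant = false;
        foreach ($values as $base) {
            if (isLongerVariant($base, $candidate, $minExtra)) {
                $isVariant = true;
                break;
            }
        }
        if (!$isVariant) {
            $kept[] = $candidate;
        }
    }
    return $kept;
}

$coLL = array(
    'somestring',
    "somestring-(don't know the delimiter)core",
    "somestring_(don't know the delimiter)-(don't know the delimiter)somethingelse",
    'someOTHERstring223',
    'someOTHERstring223_junkstring',
    'someOTHERstring223OTHERSTRING-somethingNEW',
);
print_r(keepShortest($coLL));
```

With this rule, values such as somestring1 or somestring123 survive because they are fewer than 4 characters longer than somestring. It is still quadratic in the number of values, so it says nothing about efficiency.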

Thanks everybody for your answers.


This should do it:

<?php

    $array = array('apple','apple-core','apple-core-something','orange','orange-core','orange-core-someting');
    $result = array();
    foreach ($array as $entry) {
        $entry = explode('-',$entry);
        if (!in_array($entry[0],$result)) {
            $result[] = $entry[0];
        }
    }

    print_r($result);

?>



The other answers all assume that - or some other token can delimit your shortest string. To do what you want without any delimiters, you could use something like this code:

$yourArray = Array(
    0 => "apple",
    1 => "apple-core",
    2 => "apple-core-something",
    3 => "orange",
    4 => "orange-dot",
    5 => "orange-dot-something",
) ;
$resultArray = Array() ;

foreach($yourArray as $test) {
    if(strlen($test)==0) continue(1) ;        // Drop 0 length items.
    foreach($resultArray as $rkey => $rval) {
        if(strpos($test, $rval)===0) {        // If $test starts with $rval
            continue(2) ;                     // Continue outer foreach
        } elseif(strpos($rval, $test)===0) {  // If $rval starts with $test
            unset($resultArray[$rkey]) ;      // No longer shortest unique
            continue(1) ;                     // Continue inner foreach (and add $test)
        }
    }
    $resultArray[] = $test ;
}

var_dump($resultArray) ;
// array(2) {
//   [0]=>
//   string(5) "apple"
//   [1]=>
//   string(6) "orange"
// }
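
The nested loops above compare every item against every kept result. A possible alternative, which is not part of this answer and is offered only as a sketch, is to sort the values first: in byte-wise order a string sorts directly before everything that starts with it, so each item only needs to be compared with the last value kept. shortestPrefixes is a hypothetical name.

```php
<?php
// Hypothetical helper (sort-based variant, not from the original answer).
function shortestPrefixes(array $items)
{
    // Drop empty strings, as the loop version does.
    $items = array_filter($items, function ($s) {
        return $s !== '';
    });

    // Byte-wise string order: a prefix sorts before all of its extensions.
    sort($items, SORT_STRING);

    $result = array();
    $last = null;
    foreach ($items as $item) {
        // Skip anything that starts with the most recently kept value.
        if ($last !== null && strncmp($item, $last, strlen($last)) === 0) {
            continue;
        }
        $result[] = $item;
        $last = $item;
    }
    return $result;
}

print_r(shortestPrefixes(array(
    'apple', 'apple-core', 'apple-core-something',
    'orange', 'orange-dot', 'orange-dot-something',
)));
```

Note that the result comes back in sorted order rather than in the input order, which may or may not matter for your case.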


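    $store = array();

    foreach ($data as $fruit) {
        // array_shift() takes its argument by reference, so store the
        // result of explode() in a variable first.
        $parts = explode('-', $fruit);
        $store[] = array_shift($parts);
    }

    // Drop repeated prefixes.
    $store = array_unique($store);

    print_r($store);

Here $data is the array you posted above.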


To solve your problem, divide it into two steps:

  1. Normalize each value so that it contains only the part you want to compare for exact duplicates (strtok).
  2. Remove duplicates from the array (array_unique).

Demo:

function normalize($v)
{
   return strtok($v, '-_');
}

$normalized = array_map('normalize', $data);

$unique = array_unique($normalized);

Result:

array(3) {
  [0]=>
  string(10) "somestring"
  [3]=>
  string(18) "someOTHERstring223"
  [5]=>
  string(29) "someOTHERstring223OTHERSTRING"
}

You actually build a hash for each entry in the list. The hash represents the comparison value of the original value. Then you de-duplicate the hashes (and the hashes are all you actually want).

What you need is a hash function that fulfills your needs. In the example above, the hash function is normalize.

If the outcome does not suit your needs, you need to adapt the hash function. I chose strtok because it seemed suitable for your (original) case. However, if finding the delimiter gets more complicated, you might use regular expressions to specify a delimiter, for example with preg_split or preg_replace.
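
As a rough illustration of the regular-expression route, here is a sketch of a normalize function built on preg_split. The pattern is purely an assumption for the sake of the example (any run of characters other than ASCII letters and digits is treated as a delimiter), and normalizeByPattern is a hypothetical name.

```php
<?php
// Assumption: a delimiter is any run of characters that are not
// ASCII letters or digits. Adjust the pattern to your real data.
function normalizeByPattern($v)
{
    // Split into at most two non-empty pieces; only the first is needed.
    $parts = preg_split('/[^A-Za-z0-9]+/', $v, 2, PREG_SPLIT_NO_EMPTY);
    return isset($parts[0]) ? $parts[0] : '';
}

$data = array(
    'somestring',
    "somestring-(don't know the delimiter)core",
    "somestring_(don't know the delimiter)-(don't know the delimiter)somethingelse",
    'someOTHERstring223',
    'someOTHERstring223_junkstring',
    'someOTHERstring223OTHERSTRING-somethingNEW',
);

// Normalize, de-duplicate, then reindex the keys from 0.
$unique = array_values(array_unique(array_map('normalizeByPattern', $data)));

print_r($unique);
```

For this data it gives the same three values as the strtok version, since someOTHERstring223OTHERSTRING contains no delimiter before the hyphen.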

However, to make use of a regular expression you must know what your delimiter is, because the strategy is basically to cut the string at the delimiter to build the hash. Without a well-specified delimiter there is only trial and error.


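foreach ($a as $k => $v) {
    // Compare $v only against the elements that precede it.
    foreach ($a as $k2 => $v2) {
        if ($k2 === $k)
            break;
        // $v is a prefix of the earlier $v2: drop $v2 and keep scanning,
        // since other earlier elements may also start with $v.
        if ($v === substr($v2, 0, strlen($v))) {
            unset($a[$k2]);
            continue;
        }
        // The earlier $v2 is a prefix of $v: drop $v.
        if ($v2 === substr($v, 0, strlen($v2))) {
            unset($a[$k]);
            break;
        }
    }
}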

Note: my solution just drops every element for which some other element in the array is an exact prefix of it. Your updated question doesn't have a solution, since you would have to know what the delimiters are.
