search for symbols in a string

开发者 https://www.devze.com 2023-01-25 03:05 Source: the web
I'm writing some code and need to search a string for certain kinds of symbols. I use the mb_strpos function for this. It works for alphabetic symbols, but it doesn't when I search for symbols such as question marks, dots, etc. For example, if I search for "aaaaa" (or any other Unicode character), mb_strpos works as expected, but if I search for "?????" it doesn't!

This is my code:

function symbols_in_row($string, $limit=5) {
    //split string by characters and generate new array containing each character
    $symbol = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    //remove duplicate symbols from array
    $unique = array_unique($symbol);
    //generate combination of symbols and search for them in string
    for($x=0; $x<=count($unique); $x++) {
        //generate combination of symbols
        for($c=1; $c<=$limit; $c++) {
            $combination .= $unique[$x];
        }
        //search for this combination of symbols in given string
        $pos = mb_strpos($string, $combination);
        if ($pos !== false) return false;
    }
    return true;
}

It always returns true in the second case!

Can anyone please help?


Well, may I suggest doing it in a different way?

function symbolsInRow($string, $limit = 5) {
    $regex = '/(.)\1{'.($limit - 1).',}/us';
    return 0 == preg_match($regex, $string);
}

So basically it just looks for any character repeated $limit times in a row (or more). If it finds one, it returns false. Otherwise it returns true.
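For comparison, the loop from the question can probably be made to work as well. The most likely reason it fails is that $combination is never reset, so it keeps growing across symbols ("LLLLL", then "LLLLLooooo", and so on) and only the very first symbol of the string is really tested. In addition, array_unique preserves the original keys, so indexing with $x can hit missing entries, and the <= in the outer loop runs one step too far. Below is a minimal sketch of a corrected loop; the function name symbols_in_row_fixed is made up for illustration, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical corrected version of the loop-based check from the question.
function symbols_in_row_fixed(string $string, int $limit = 5): bool
{
    // Split into characters (UTF-8 aware); preg_split returns false on invalid UTF-8.
    $symbols = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    if ($symbols === false) {
        return true; // assumption: treat unreadable input as "no run found"
    }

    // foreach avoids the gaps in the keys that array_unique leaves behind.
    foreach (array_unique($symbols) as $symbol) {
        // Build the run from scratch for every symbol instead of appending to it.
        $combination = str_repeat($symbol, $limit);
        if (mb_strpos($string, $combination, 0, 'UTF-8') !== false) {
            return false; // $limit identical symbols in a row were found
        }
    }
    return true;
}

var_dump(symbols_in_row_fixed('Lorem ipsum ????? dolor')); // bool(false)
var_dump(symbols_in_row_fixed('Lorem ipsum ??? dolor'));   // bool(true)
```

The regex version above is still shorter and does a single pass over the string, so this is mainly useful for seeing what went wrong in the original.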


You can do it with a simple regular expression:

<pre>
<?php 

$str="Lorem ipsum ?????? dolor sit amet xxxxx ? consectetuer faucibus.";
// "u" makes "." match whole UTF-8 characters instead of single bytes
preg_match_all('@(.)\1{4,}@us',$str,$out);
print_r($out);
?>
</pre>

To explain the expression:

(.) matches any single character and captures it as group 1
\1 is a back-reference to that captured character
{4,} means the back-reference has to occur 4 times or more (so together with the captured character itself you match 5 or more identical characters)
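The same expression can be built from a configurable limit. The sketch below is only an illustration: the helper name find_runs is hypothetical, it assumes $limit is at least 1 and that the subject is valid UTF-8, and it adds the "u" modifier so that multi-byte characters count as one character each.

```php
<?php
// Hypothetical helper: returns every run of at least $limit identical characters.
function find_runs(string $str, int $limit = 5): array
{
    // (.) captures one character; \1{n,} requires n or more further copies of it.
    // "u" treats pattern and subject as UTF-8; "s" lets "." match newlines too.
    $pattern = '@(.)\1{' . ($limit - 1) . ',}@us';
    preg_match_all($pattern, $str, $out);

    // $out[0] holds the full matches; fall back to an empty list if matching failed.
    return $out[0] ?? [];
}

$str = "Lorem ipsum ?????? dolor sit amet xxxxx ? consectetuer faucibus.";
print_r(find_runs($str)); // the two runs "??????" and "xxxxx"
```

An empty result means no character is repeated $limit times in a row, which is the case where the check from the question should return true.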
