I'm writing some code and need to search for certain kinds of symbols in a string. I use the mb_strpos function for this, and it works for alphabetic characters, but it doesn't when I search for symbols such as question marks, dots, etc. For example, if I search for "aaaaa" (or any other Unicode character) in a string, mb_strpos works as expected, but if I search for "?????" it doesn't!
This is my code:
function symbols_in_row($string, $limit=5) {
    //split string by characters and generate new array containing each character
    $symbol = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    //remove duplicate symbols from array
    $unique = array_unique($symbol);
    //generate combination of symbols and search for them in string
    for($x=0; $x<=count($unique); $x++) {
        //generate combination of symbols
        for($c=1; $c<=$limit; $c++) {
            $combination .= $unique[$x];
        }
        //search for this combination of symbols in given string
        $pos = mb_strpos($string, $combination);
        if ($pos !== false) return false;
    }
    return true;
}
It always returns true in the second case!
Can anyone please help?
Well, may I suggest doing it in a different way?
function symbolsInRow($string, $limit = 5) {
$regex = '/(.)\1{'.($limit - 1).',}/us';
return 0 == preg_match($regex, $string);
}
So basically it just looks for any character repeated $limit times in a row (or more). If it finds any, it returns false. Otherwise it returns true.
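As a rough check of the regex approach, here is a small sketch that runs it next to a repaired version of the loop from the question. The likely reason the original loop misses "?????" is that $combination is never reset, so each pass appends to the previous run ("LLLLL", then "LLLLLooooo", and so on) and only the very first symbol of the string is ever tested on its own. The name symbolsInRowLoop is made up for this illustration, and the sketch assumes valid UTF-8 input and a $limit of at least 1.

```php
<?php
// Copied from the answer above so the snippet runs on its own.
function symbolsInRow($string, $limit = 5) {
    $regex = '/(.)\1{' . ($limit - 1) . ',}/us';
    return 0 == preg_match($regex, $string);
}

// Hypothetical repaired version of the question's loop (illustration only).
function symbolsInRowLoop($string, $limit = 5) {
    // Split the string into UTF-8 characters (assumes valid UTF-8).
    $symbols = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    // array_unique() keeps the original keys, so iterate with foreach
    // rather than counting indexes up from 0.
    foreach (array_unique($symbols) as $symbol) {
        // Build a fresh run for each symbol instead of appending to the old one.
        $combination = str_repeat($symbol, $limit);
        if (mb_strpos($string, $combination, 0, 'UTF-8') !== false) {
            return false;
        }
    }
    return true;
}

var_dump(symbolsInRow('Lorem ????? ipsum'));     // bool(false): five "?" in a row
var_dump(symbolsInRow('Lorem ???? ipsum'));      // bool(true): only four in a row
var_dump(symbolsInRowLoop('Lorem ????? ipsum')); // bool(false): agrees with the regex
```

The regex version does the same job in a single pass over the string, which is probably why it was suggested instead of repairing the loop.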
...
You can do it with a simple regExp:
<pre>
<?php
$str = "Lorem ipsum ?????? dolor sit amet xxxxx ? consectetuer faucibus.";
// the u modifier makes (.) match whole UTF-8 characters instead of single bytes
preg_match_all('@(.)\1{4,}@us', $str, $out);
print_r($out);
?>
</pre>
To explain the expression:
(.)
matches any single character and captures it so it can be referenced later
\1
refers back to that captured character
{4,}
the back-reference has to occur 4 times or more (so the captured character plus at least 4 repeats gives you 5 or more identical characters in a row)
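To make the pieces of the expression concrete, here is a minimal sketch of what preg_match_all collects for the sample string. The u modifier is an addition for this sketch: it makes (.) capture whole UTF-8 characters rather than single bytes, which matters if the repeated symbol is multibyte.

```php
<?php
// Sample string from the answer above.
$str = "Lorem ipsum ?????? dolor sit amet xxxxx ? consectetuer faucibus.";

// 'u' treats pattern and subject as UTF-8; 's' lets . match newlines too.
preg_match_all('@(.)\1{4,}@us', $str, $out);

// $out[0] holds each full run, $out[1] the captured character of that run.
print_r($out[0]); // the runs "??????" and "xxxxx"
print_r($out[1]); // the characters "?" and "x"
```

The lone "?" near the end of the string is not reported, because it is not followed by at least four more copies of itself.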