Could you please point me to the mistake in my regular expression?
/[\x{4e00}-\x{9fa5}]*[.\s]*\[\/m\][\x{4e00}-\x{9fa5}]/u
My string starts with a Chinese character ([\x{4e00}-\x{9fa5}]), followed by any characters, and ends with '[/m]' and another Chinese character. So the string could look like:
我... some text goes here (contains any characters including spaces and new lines)... [/m]我
But unfortunately my regular expression doesn't work as expected.
Matching Chinese characters with regular expressions (php)
<?php
// The pattern /\p{Han}+/u matches one or more Han (Chinese) characters.
$string = '我... some text goes here (contains any characters including spaces and new lines)... [/m]我';

// Check whether the string contains any Chinese characters.
if (preg_match('/\p{Han}+/u', $string)) {
    echo "chinese here";
}

// Collect every run of Chinese characters; $matches[0] holds the full matches.
if (preg_match_all('/\p{Han}+/u', $string, $matches)) {
    print_r($matches);
}
?>
chinese here
Array (
[0] => Array
(
[0] => 我
[1] => 我
)
)
You can then loop over the matches with foreach and replace the characters you want.
It looks like you probably want to replace the first '*' with '+' to ensure there is at least one matching character in the initial spot. You can also drop the character group with '\s' and just use '.', which matches any character except a newline (add the 's' modifier if the text can span several lines). Also, if this is meant to match a complete line, I would start the regex with '^' and end it with '$'.
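A minimal sketch of that suggestion, assuming the whole string has to match. The function name `matchesWholeString` and the sample strings are made up for illustration, and the 's' modifier is an addition so that '.' can cross newlines:

```php
<?php
// Hypothetical helper: true when the whole string starts with Chinese
// characters and ends with '[/m]' followed by one Chinese character.
function matchesWholeString(string $subject): bool
{
    // '^' and '$' anchor the match; 'u' enables UTF-8, 's' lets '.' match newlines.
    $pattern = '/^[\x{4e00}-\x{9fa5}]+.*\[\/m\][\x{4e00}-\x{9fa5}]$/us';
    return preg_match($pattern, $subject) === 1;
}

$samples = [
    "我 some text\nacross lines [/m]我", // starts and ends as described
    "x我 some text [/m]我",              // does not start with a Chinese character
];

foreach ($samples as $sample) {
    echo matchesWholeString($sample) ? "match\n" : "no match\n";
}
```

With these two samples the first should report a match and the second should not.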
- If there should only be one Chinese character at the beginning, drop the first '*'.
- However, '[.\s]' will not do what you want: inside a character class, '.' is a literal dot, so the class only matches dots and whitespace. A bare '.' doesn't match newlines by default, so use '.' together with the 's' modifier.
- Once that's done, make sure the problem comes from the regexp and not from the PHP code.
/[\x{4e00}-\x{9fa5}].*\[\/m\][\x{4e00}-\x{9fa5}]/us
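A quick way to check how '[.\s]' behaves compared with a bare '.' under the 's' modifier. This is only a sketch; the sample text and variable names are invented for the comparison:

```php
<?php
// Sample text containing letters and a newline between the Chinese characters.
$text = "我 abc\ndef [/m]我";

// Inside [...] the dot is literal, so [.\s]* cannot consume letters such as 'abc'.
$withClass = preg_match('/[\x{4e00}-\x{9fa5}][.\s]*\[\/m\][\x{4e00}-\x{9fa5}]/u', $text);

// With the 's' modifier a bare '.' matches any character, newlines included.
$withDotAll = preg_match('/[\x{4e00}-\x{9fa5}].*\[\/m\][\x{4e00}-\x{9fa5}]/us', $text);

var_dump($withClass, $withDotAll);
```

For this sample the first call should return 0 (no match) and the second 1.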
[\x{4e00}-\x{9fa5}]+.+\[\/m\][\x{4e00}-\x{9fa5}]
Which matches your description:
[\x{4e00}-\x{9fa5}]+
--> One or more chars between 4E00 and 9FA5.
.+
--> One or more other chars (newlines are included only if the 's' modifier is set)
\[\/m\]
--> [/m]
[\x{4e00}-\x{9fa5}]
--> One char between 4E00 and 9FA5
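To see the pieces working together, here is a small sketch of that pattern wrapped in delimiters. The 'u' and 's' modifiers and the capture group around '.+' are additions for illustration, and the sample text is invented:

```php
<?php
// 'u' treats pattern and subject as UTF-8; 's' lets '.' match newlines too.
// The parentheses around '.+' are an illustrative addition to expose the middle part.
$pattern = '/[\x{4e00}-\x{9fa5}]+(.+)\[\/m\][\x{4e00}-\x{9fa5}]/us';
$text = "我 first line\nsecond line [/m]我";

if (preg_match($pattern, $text, $m)) {
    // $m[1] holds everything between the leading Chinese characters and '[/m]'.
    var_dump($m[1]);
}
```

The captured group makes it easy to confirm that the text between the two Chinese characters, line breaks included, was consumed by '.+'.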