Can someone explain why all this code works normally if PHP is only supposed to support a 256-character set?
I know that the Content-Type tag makes the browser interpret these characters as UTF-8. But why does PHP handle them?
echo "匝";
if (preg_match('/啊/', "啊"))
    echo "Match";
if (preg_match('/\w/', "啊"))
    echo "Match";
Compare your code to:
if (preg_match('/^\w$/', "啊"))
    echo "Match";
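The regex /\w/ matches because, without the /u modifier, PCRE treats the subject as a sequence of single bytes, and an unanchored \w succeeds as soon as any one of those bytes counts as a word character. For example, 匝 is U+531D: stored as UTF-16 it is the two bytes 0x53 and 0x1D, and the first one, 0x53, looks like the valid single-byte character S. (In UTF-8 the same character is three bytes, 0xE5 0x8C 0x9D, and whether bytes above 0x7F count as \w can depend on the locale.) The anchored /^\w$/ fails either way, because the string is longer than one byte.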
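A quick way to see the byte-level view is to inspect what PHP actually holds in the string. This is only a sketch: it spells out the UTF-8 bytes of 啊 (U+554A) with escapes so that the result does not depend on how the script file is saved, and the variable name $s is just for illustration.

```php
<?php
// UTF-8 encoding of 啊 (U+554A), written as byte escapes.
$s = "\xE5\x95\x8A";

// PHP strings are byte arrays, so strlen() counts bytes, not characters.
echo strlen($s), "\n";   // 3
echo bin2hex($s), "\n";  // e5958a

// Without /u the subject is three separate bytes, so a pattern that
// demands exactly one "character" between ^ and $ cannot match.
var_dump(preg_match('/^\w$/', $s));       // int(0)

// With /u the subject is decoded as UTF-8 and seen as one code point.
var_dump(preg_match('/^\p{L}$/u', $s));   // int(1)
```

The unanchored /\w/ case is left out on purpose, since its result can vary with the locale.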
PS: this is a valid way to match a single multibyte letter:
var_dump(preg_match('/^\p{L}$/u', "匝", $matches));
Most likely your PCRE was compiled with Unicode support enabled (--enable-utf8 --enable-unicode-properties), which lets preg_match() match Unicode characters when the pattern uses the /u modifier.
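If you want to check this on your own installation, something like the following should work. It is a rough sketch, not an official detection method: it relies on preg_match() returning false when a /u pattern with a Unicode property cannot be compiled, and the variable name $result is only for illustration.

```php
<?php
// PCRE_VERSION is a predefined PHP constant with the PCRE library version.
echo PCRE_VERSION, "\n";

// UTF-8 encoding of 匝 (U+531D), written as byte escapes.
$subject = "\xE5\x8C\x9D";

// \p{L} with /u needs UTF-8 and Unicode property support in PCRE.
// On failure preg_match() returns false and raises a warning,
// which @ suppresses here.
$result = @preg_match('/^\p{L}$/u', $subject);

if ($result === false) {
    echo "PCRE appears to lack Unicode support\n";
} else {
    echo "Unicode support is available, match result: $result\n";
}
```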