
Why Unicode works in PHP

Developer https://www.devze.com 2023-03-18 11:12 Source: Internet
Can someone explain why all this code works normally if PHP is only supposed to support a 256-character set?

I know the browser displays these characters correctly when the Content-Type declares UTF-8. But why does PHP itself handle them?

echo "匝";

if (preg_match('/啊/', "啊"))
    echo "Match";

if (preg_match('/\w/', "啊"))
    echo "Match";


Compare your code to:

if (preg_match('/^\w$/', "啊"))
    echo "Match";

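The unanchored regex /\w/ can match because, without the /u modifier, PCRE treats the subject as a sequence of single bytes and only needs one of them to count as a word character. Note that 0x53 0x1D is the code point of 匝 (U+531D), not its UTF-8 encoding: in a UTF-8 file 匝 is the three bytes 0xE5 0x8C 0x9D, and 啊 (U+554A) is 0xE5 0x95 0x8A. Whether any of those high bytes is accepted by \w depends on the locale tables PCRE is using; in a Latin-1 locale, for instance, 0xE5 is the letter å. The anchored pattern /^\w$/ fails either way, because the string is longer than one byte.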
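A small sketch of how PHP sees such a string at the byte level. It assumes the character is UTF-8 encoded; the bytes are written as escapes so the result does not depend on how the file is saved. The variable name $s is only for illustration.

```php
<?php
// UTF-8 bytes of 啊 (U+554A), written as escapes on purpose.
$s = "\xE5\x95\x8A";

// PHP strings are byte strings: strlen() counts bytes, not characters.
var_dump(strlen($s));                    // int(3)
var_dump(bin2hex($s));                   // string(6) "e5958a"

// Without /u, one \w can cover at most one byte, so an anchored
// single \w cannot match a three-byte string.
var_dump(preg_match('/^\w$/', $s));      // int(0)

// Without /u, "." also consumes one byte at a time: exactly three fit.
var_dump(preg_match('/^.{3}$/s', $s));   // int(1)
```

This also shows why echo "works": PHP simply passes the bytes through, and the browser decodes them as UTF-8.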

PS: this is a valid way to match a single multibyte letter:

var_dump(preg_match('/^\p{L}$/u', "匝", $matches));
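To see what the /u modifier changes, here is a minimal sketch that assumes PCRE was built with Unicode support. The variable names $s and $bad are illustrative. With /u the subject must be valid UTF-8, so a truncated sequence is rejected rather than matched byte by byte.

```php
<?php
// UTF-8 bytes of 匝 (U+531D).
$s = "\xE5\x8C\x9D";

// With /u the three bytes are decoded as one character, a letter.
$n = preg_match('/^\p{L}$/u', $s, $matches);
var_dump($n);                    // int(1)
var_dump($matches[0] === $s);    // bool(true)

// A truncated sequence is not valid UTF-8: preg_match() returns false
// and preg_last_error() reports PREG_BAD_UTF8_ERROR.
$bad = "\xE5\x8C";
$result = preg_match('/^\p{L}$/u', $bad);
var_dump($result);                                     // bool(false)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR);   // bool(true)
```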


Most likely your PCRE library was compiled with Unicode support enabled (--enable-utf8 --enable-unicode-properties), which is what allows preg_match() to match Unicode characters. UTF-8 mode still has to be requested per pattern with the /u modifier.
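One way to check this on a given installation is to probe PCRE at runtime. The function below, detect_pcre_unicode_support(), is a hypothetical helper written for this illustration and is not part of PHP. It relies only on the predefined PCRE_VERSION constant and on preg_match() returning false when a /u pattern cannot be compiled.

```php
<?php
// Hypothetical helper (not part of PHP): probes the PCRE library at runtime.
function detect_pcre_unicode_support(): array
{
    // "\xC3\xA9" is the UTF-8 encoding of é (U+00E9).
    $subject = "\xC3\xA9";

    return [
        // Version string of the PCRE library PHP is using.
        'version' => PCRE_VERSION,
        // @ hides the warning raised if /u is not supported.
        'utf8' => @preg_match('/^.$/u', $subject) === 1,
        // \p{L} additionally needs Unicode property support.
        'unicode_properties' => @preg_match('/^\p{L}$/u', $subject) === 1,
    ];
}

var_dump(detect_pcre_unicode_support());
```

If both flags come back true, patterns with the /u modifier and \p{...} classes should be usable on that installation.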
