
Does anyone understand this PHP function? Why is it guaranteed to output only Chinese characters?

Developer https://www.devze.com 2023-01-21 11:31 Source: web
function getChnRandChar($length) {
    mt_srand((double)microtime() * 1000000);
    $hanzi = '';
    for ($i = 0; $i < $length; $i++) {
        $number = mt_rand(16, 56) * 100 + mt_rand(1, 19);
        $tmpHanzi = chr(mb_substr($number, 0, 2) + 160);
        $tmpHanzi .= chr(mb_substr($number, 2, 2) + 160);
        $hanzi .= mb_convert_encoding($tmpHanzi, 'utf8', 'gb2312');
    }
    return $hanzi;
}


It first generates a random GB2312 character and then converts it to UTF-8.

The character it generates lies in rows 16 to 56 and columns 1 to 19 of the 94x94 GB2312 grid. That region contains only Chinese characters, so the function covers a small subset of them (41 x 19 = 779 of the 6,763 in GB2312) and can never produce any of the non-Chinese characters in the set.
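To check that claim rather than take it on trust, here is a minimal sketch that enumerates every row and column combination the function can pick and tests each result against the Unicode Han script property. The function name `allPossibleChars` is made up for this illustration, and the sketch assumes the mbstring extension is loaded and PHP 7.4 or later (for the arrow function).

```php
<?php
// Hypothetical helper (not part of the original function): build every
// character that getChnRandChar() could possibly emit.
function allPossibleChars(): array
{
    $chars = [];
    for ($row = 16; $row <= 56; $row++) {       // same range as mt_rand(16, 56)
        for ($col = 1; $col <= 19; $col++) {    // same range as mt_rand(1, 19)
            // Two-byte EUC-CN sequence: row and column, each offset by 160.
            $bytes = chr($row + 160) . chr($col + 160);
            $chars[] = mb_convert_encoding($bytes, 'UTF-8', 'EUC-CN');
        }
    }
    return $chars;
}

$chars = allPossibleChars();

// Keep only the results that are NOT a single Han-script character.
$nonHan = array_filter($chars, fn($c) => preg_match('/^\p{Han}$/u', $c) !== 1);

echo count($chars), " candidates, ", count($nonHan), " outside the Han script\n";
```

If the reasoning above holds, the second count is zero: every candidate is a Chinese character.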

It first generates a random number in one of the ranges 1601-1619, 1701-1719, ..., 5601-5619, all of which are valid GB2312 code points (a two-digit row number followed by a two-digit column number). The second and third lines of the for loop then encode the code point as a two-byte EUC-CN sequence:

To map a code point to bytes, add 160 (0xA0) to its first two digits (the thousands and hundreds places) to form the high byte, and add 160 (0xA0) to its last two digits (the tens and ones places) to form the low byte.

The last line then converts the 2-byte EUC-CN encoded character to a UTF-8 character.
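As a worked example of the steps above, the sketch below pushes the smallest possible value, 1601, through the same arithmetic. The helper name `kutenToEucCn` is an assumption made for this illustration; it uses integer division instead of the original `mb_substr` calls, which gives the same digits for a four-digit number. It assumes the mbstring extension is loaded and PHP 7 or later (for `intdiv`).

```php
<?php
// Hypothetical helper (not in the original post): encode a four-digit
// GB2312 row-column number as a two-byte EUC-CN string.
function kutenToEucCn(int $number): string
{
    $row = intdiv($number, 100); // thousands and hundreds digits, e.g. 16
    $col = $number % 100;        // tens and ones digits, e.g. 1
    return chr($row + 0xA0) . chr($col + 0xA0);
}

$bytes = kutenToEucCn(1601);
echo bin2hex($bytes), "\n"; // b0a1  (16 + 160 = 0xB0, 1 + 160 = 0xA1)

// Same conversion step as the last line of the loop. 'EUC-CN' is used here
// as the source encoding; the original passes 'gb2312'.
$utf8 = mb_convert_encoding($bytes, 'UTF-8', 'EUC-CN');
echo $utf8, "\n";           // 啊
echo bin2hex($utf8), "\n";  // e5958a
```

The two EUC-CN bytes become three UTF-8 bytes, which is why `$hanzi` is built from the converted string rather than from the raw bytes.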


Because the character encoding is GB2312.

GB2312 is the registered internet name for a key official character set of the People's Republic of China, used for simplified Chinese characters.
