Detect Chinese characters using Perl?

开发者 https://www.devze.com 2023-03-24 23:02 Source: Internet
Is there any way to detect Chinese characters using Perl? And is there a way to split Chinese characters with the dot symbol '.' perfectly?


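It depends on your particular notion of what a Chinese character is. Perhaps you're looking for /\p{Script=Hani}/, which matches a single character of the Han script. But if we want to cast our net wide, the following regex pattern will match stuff that occurs in Chinese writing, including compatibility forms, radicals, strokes and CJK punctuation. Restrict if necessary.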

use 5.014;
# Union of Unicode blocks that occur in Chinese writing.
# Use it as: $string =~ $cjk
my $cjk = qr/
    (?: \p{Block=CJK_Compatibility}
    |   \p{Block=CJK_Compatibility_Forms}
    |   \p{Block=CJK_Compatibility_Ideographs}
    |   \p{Block=CJK_Compatibility_Ideographs_Supplement}
    |   \p{Block=CJK_Radicals_Supplement}
    |   \p{Block=CJK_Strokes}
    |   \p{Block=CJK_Symbols_And_Punctuation}
    |   \p{Block=CJK_Unified_Ideographs}
    |   \p{Block=CJK_Unified_Ideographs_Extension_A}
    |   \p{Block=CJK_Unified_Ideographs_Extension_B}
    |   \p{Block=CJK_Unified_Ideographs_Extension_C}
    )
/x;
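
For the narrower case, here is a minimal sketch of detection with the script property alone. The helper names has_han and han_chars are made up for illustration, and the sketch assumes the strings are already decoded character data. Keep in mind that the Han script also covers Japanese kanji and Korean hanja, so a match does not prove the text is Chinese.

```perl
use strict;
use warnings;
use utf8;    # the string literals in this file are written in UTF-8

# Hypothetical helper: 1 if the string contains at least one Han character, else 0.
sub has_han {
    my ($text) = @_;
    return $text =~ /\p{Script=Han}/ ? 1 : 0;
}

# Hypothetical helper: in list context, returns every Han character in the string.
sub han_chars {
    my ($text) = @_;
    return $text =~ /(\p{Script=Han})/g;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $sample ('ice cream', '冰淇淋', 'mixed 冰 text') {
    printf "%s: %s\n", $sample, has_han($sample) ? 'contains Han' : 'no Han';
}
```

\p{Script=Han} is the long spelling of \p{Script=Hani}; both name the same property value.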

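Yes, . matches one character, as long as the string holds decoded characters rather than raw bytes. The empty pattern for split does what you mean (DWYM) and returns the individual characters: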

use utf8;
split //, '冰淇淋'
# returns ('冰', '淇', '淋')
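
The character-versus-byte distinction is where this usually goes wrong, so here is a rough sketch of both cases. It assumes the input may arrive as UTF-8 encoded bytes (for example from a filehandle without an encoding layer) and uses the core Encode module to decode it. The variable names are illustrative only. The last line covers the other reading of the question, splitting on a literal dot, which has to be escaped in the pattern.

```perl
use strict;
use warnings;
use utf8;                          # string literals below are UTF-8 source text
use Encode qw(decode encode);

my $text = '冰淇淋';

# Decoded character data: both approaches give one element per character.
my @by_split = split //, $text;
my @by_dot   = $text =~ /(.)/gs;   # /s lets . match newlines as well

# The same text as raw UTF-8 bytes: split now cuts between bytes.
my $bytes       = encode('UTF-8', $text);
my @byte_pieces = split //, $bytes;                   # 9 one-byte pieces
my @decoded     = split //, decode('UTF-8', $bytes);  # 3 characters again

# Splitting on a literal dot: escape it, otherwise . means "any character".
my @fields = split /\./, '冰.淇.淋';

binmode STDOUT, ':encoding(UTF-8)';
print "split //: ",     scalar(@by_split),    " pieces\n";
print "/(.)/gs: ",      scalar(@by_dot),      " pieces\n";
print "raw bytes: ",    scalar(@byte_pieces), " pieces\n";
print "after decode: ", scalar(@decoded),     " pieces\n";
print "split on dot: ", join(' ', @fields),   "\n";
```

If split returns three times as many pieces as expected, the string is most likely still undecoded bytes.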