
Regex in javascript working with Cyrillic (Russian) set

Developer https://www.devze.com 2022-12-15 08:38 Source: Internet

Is it possible to work with Russian characters in JavaScript's regex?

Maybe the use of \p{Cyrillic}?

If yes, please provide a basic example of usage.

The example:

var str1 = "абв прв фву";
var regexp = new RegExp("[вф]\\b", "g");

alert(str1.replace(regexp, "X"));

I expect to get: абX прX фву


Here is a good article on JavaScript regular expressions and Unicode. JavaScript strings are sequences of 16-bit code units, so strings and RegExp objects can contain Unicode characters, but most of the shorthand classes and assertions, such as '\b', '\d' and '\w', only recognize ASCII. Your regular expression fails because of '\b': every Cyrillic letter counts as a non-word character, so no word boundary is ever found in your string. You'll have to detect word boundaries some other way.
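One possible workaround, offered as a rough sketch rather than a definitive fix: replace '\b' with a negative lookahead against an explicit Cyrillic letter class. The name CYR and the choice of letters (the basic Russian alphabet plus ё/Ё) are assumptions made for illustration, and the lookahead only emulates the end-of-word side of '\b'.

```javascript
// Explicit Cyrillic letter class: а-я and А-Я, plus ё/Ё, which lie outside those ranges.
// CYR is a hypothetical helper name, not something from the question.
var CYR = "а-яА-ЯёЁ";
var str1 = "абв прв фву";

// The original pattern: \b never matches here, because Cyrillic letters are not \w.
var unchanged = str1.replace(new RegExp("[вф]\\b", "g"), "X");

// Match в or ф only when the next character is not a Cyrillic letter
// (this also covers the end of the string).
var regexp = new RegExp("[вф](?![" + CYR + "])", "g");
var result = str1.replace(regexp, "X");
// unchanged is "абв прв фву", result is "абX прX фву"
```

If digits or Latin letters can appear inside your words, they would need to be added to the class as well.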


It should work if you just save the JavaScript file as UTF-8. Then you can type any character directly into a string.

edit: Just made a quick example with some Cyrillic characters from Wikipedia:

var cyrillic = 'абвгдеёжзийклмнопрстуфхцчшщъыьэюяабвгдеёжзийклмнопрстуфхцчшщъыьэюя';
cyrillic.match( 'л.+а' )[0];
// returns as expected: "лмнопрстуфхцчшщъыьэюяа"


According to this:

JavaScript, which does not offer any Unicode support through its RegExp class, does support \uFFFF for matching a single Unicode code point as part of its string syntax.

so you can at least use code points, but seemingly nothing more (no classes).
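To illustrate the code-point approach, here is a minimal sketch that spells a Cyrillic letter class with \u escapes, so the pattern does not depend on the file's encoding. The variable names are made up for the example. The second half assumes a newer engine that implements ES2018 Unicode property escapes; there the property has to be written as \p{Script=Cyrillic} together with the "u" flag, and the bare \p{Cyrillic} from the question is, as far as I know, rejected.

```javascript
// Cyrillic letters written as \u escapes, so the source file can stay pure ASCII.
// \u0410-\u044F covers А through я; \u0401 and \u0451 are Ё and ё.
var cyrillicLetter = /[\u0410-\u044F\u0401\u0451]/;
var hasCyrillic = cyrillicLetter.test("abc \u0436 def"); // \u0436 is ж
var hasNone = cyrillicLetter.test("abc def");

// Assumption: the engine supports ES2018 Unicode property escapes.
// Built with new RegExp so that an older engine fails at this line, not at parse time.
var wordEnd = new RegExp("[вф](?!\\p{Script=Cyrillic})", "gu");
var replaced = "абв прв фву".replace(wordEnd, "X");
// hasCyrillic is true, hasNone is false, replaced is "абX прX фву"
```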

Also check out this duplicate of your question.
