Does Ruby share PHP's multibyte string problem?

Developer https://www.devze.com 2022-12-15 10:03 Source: web
PHP has a lot of trouble with multibyte strings (strings containing non-ASCII characters). The entire language was built on the assumption that each character is one byte. To solve this they added the mbstring functions (mb_*), which work fine and which you can use instead of the standard functions.

strlen($str);    // counts bytes
mb_strlen($str); // correct

However, this is a real pain: you have to verify that any code you download or find online uses these functions, or else enable mbstring.func_overload, which might then break code that actually needs char = byte calculations.

Does Ruby share this problem?


It shares the problem. It's covered here at SO. You can use ActiveSupport::Multibyte for mb_chars support.

>> s =  "Iñtërnâtiônàlizætiøn"
=> "Iñtërnâtiônàlizætiøn"
>> puts s[0..3]
Iñt
=> nil
>> puts s.mb_chars[0..3]
Iñtë
=> nil
>> puts s.mb_chars.size
20
=> nil
>> puts s.size
27
=> nil
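The gap between the two sizes comes from UTF-8 storing each accented letter in this string as two bytes. Here is a minimal sketch that reproduces both counts with core String methods only (no ActiveSupport), assuming the source file is saved as UTF-8. The helper names byte_count and codepoint_count are made up for illustration:

```ruby
# encoding: utf-8

# Count raw bytes: 'C*' unpacks every byte as an unsigned integer.
def byte_count(str)
  str.unpack('C*').size
end

# Count UTF-8 code points: 'U*' decodes the bytes as UTF-8.
def codepoint_count(str)
  str.unpack('U*').size
end

s = "Iñtërnâtiônàlizætiøn"
puts byte_count(s)       # 27
puts codepoint_count(s)  # 20
```

The seven accented letters account for the seven extra bytes.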


I think Ruby 1.9 clears up this underlying assumption.


irb(main):002:0> 'ÿ'.length
=> 2
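In Ruby 1.9 and later, every string carries an encoding. length and indexing count characters in that encoding, while bytesize and byteslice work on raw bytes. Here is a minimal sketch, assuming a UTF-8 source file and Ruby 1.9.3 or newer (for byteslice). The helper char_and_byte_sizes is hypothetical. Note that a string tagged as binary still counts bytes, so the result depends on how the input is encoded:

```ruby
# encoding: utf-8

# Return [character count, byte count] for a string in its own encoding.
def char_and_byte_sizes(str)
  [str.length, str.bytesize]
end

s = "Iñtërnâtiônàlizætiøn"
p char_and_byte_sizes(s)   # [20, 27]
puts s[0..3]               # Iñtë  (first four characters)
puts s.byteslice(0, 4)     # Iñt   (first four bytes)

# The same bytes tagged as binary: every byte counts as one "character".
raw = "ÿ".dup.force_encoding(Encoding::BINARY)
p char_and_byte_sizes("ÿ")  # [1, 2]
p char_and_byte_sizes(raw)  # [2, 2]
```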