Ruby: ARGV breaks accented characters

Developer https://www.devze.com 2023-04-03 02:53 Source: Internet
# encoding: utf-8
foo = "Résumé"
p foo

> "Résumé"

# encoding: utf-8
ARGV.each do |argument|
    p argument
end

test.rb Résumé

> "R\xE9sum\xE9"

Why does this occur, and how can I get ARGV to return "Résumé"?

I have already set chcp 65001 and am using ruby 1.9.2p290 (2011-07-09) [i386-mingw32].

EDIT: After asking around on IRC, I was told to run chcp 1252>NUL, which fixed the problem.


For some reason, Windows doesn't use UTF-8 in your console. So, although Ruby expects a UTF-8 encoded string, it receives a Windows-1252 encoded one: \xE9 is the single Windows-1252 byte for é.

So you have several options (which I can't test because, fortunately, I don't use Windows):

  1. Persuade Windows to use UTF-8 in your console. I don't know whether chcp is supposed to do this or, if it is, why it doesn't here.
  2. Tell Ruby to use Windows-1252 instead of UTF-8 as the default encoding.
  3. Convert ARGV from Windows-1252 to UTF-8 manually:

Example:

>> argument = "R\xE9sum\xE9"
=> "R\xE9sum\xE9"
>> argument.force_encoding('windows-1252').encode('utf-8')
=> "Résumé"
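The same conversion can be applied to every argument at the top of a script. This is only a minimal sketch: the helper name argument_to_utf8 and the utf8_args variable are hypothetical (they are not from the original post), and it assumes the console really delivers Windows-1252 bytes. If the console code page is something else, the source encoding would need to change accordingly.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of Ruby or the original post): reinterpret the
# raw bytes of one argument as Windows-1252, then transcode them to UTF-8.
# Assumption: the console passed Windows-1252 bytes to Ruby.
def argument_to_utf8(argument, source_encoding = 'Windows-1252')
  # dup first: force_encoding changes its receiver in place, and we want to
  # leave the original ARGV entry untouched.
  argument.dup.force_encoding(source_encoding).encode('UTF-8')
end

# Convert all command-line arguments once, then work with the UTF-8 copies.
utf8_args = ARGV.map { |argument| argument_to_utf8(argument) }

utf8_args.each do |argument|
  p argument
end
```

Note that force_encoding only relabels the bytes, while encode actually rewrites them, so the order of the two calls matters.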