I created an ordinary text file on Windows 7 64-bit using GNU Emacs 23.3.1. I can edit the file with other programs such as LinqPad (the file happens to be a LinqPad script, extension .linq). Everything is fine until I put a Unicode character in the file, such as the Greek letter λ (lambda). I can input the letter in Emacs and it displays correctly. However, Emacs refuses to save the file, reporting the following error:
Failure in loading charset map: 8859-7
If I input the λ in LinqPad, Emacs will read and display it, but will not save the file.
I just noticed that Notepad++ has other unexpected behavior with this file: it does not display the λ's, but instead pairs of odd characters such as Î». That is fitting to an untuition (pun intended) that the Unicode chars are being stored as pairs. So it looks like this is a kind of ambiguous situation (storing Unicode in text files), but it also looks like LinqPad and Visual Studio "do the obvious thing."
I want to use emacs because it's the only program that I have that reflows sequences of commented lines (lines after //, reflows them with Alt-Q), and I want to use greek characters in my comments because I'm describing a mathematical program.
I'll be grateful for advice and answers.
UPDATE: some advice in other questions said to try M-x describe-char, also bound to C-x = ; both of those give me the same failure message as above, so they're on the right track, just not answers.
This once happened to me when I had upgraded all packages (including Emacs) without realising I still had an Emacs session open during the upgrade. Next time I asked it to save some Unicode, it tried to load 8859-7 and failed because the path was different in the upgraded version. I had to redo the edit after restarting Emacs.
I just noticed that Notepad++ has other unexpected behavior with this file: it does not display the λ's, but instead pairs of odd characters such as Î».
Î» is what you get when you interpret the byte sequence 0xCE, 0xBB using the encoding ISO-8859-1, or Windows code page 1252 (Western European). Code page 1252 is probably the default (‘ANSI’) code page on your machine.
0xCE, 0xBB is the UTF-8 encoding of the character λ (U+03BB Greek small letter lambda). So to display it correctly you need to tell your text editor that the file is saved in UTF-8 and not ANSI.
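If you want to check the byte-level story for yourself, here is a minimal Python sketch (Python is my choice for illustration, it is not part of the original question; the variable names are made up). It encodes λ as UTF-8 and then decodes the same two bytes as code page 1252 and as ISO-8859-1, which is roughly what an editor does when it assumes the file is ‘ANSI’.

```python
# U+03BB is the Greek small letter lambda. It is written as an escape so that
# this script does not depend on how the script file itself is encoded.
lam = "\u03bb"

# UTF-8 stores this one character as two bytes.
utf8_bytes = lam.encode("utf-8")
print([hex(b) for b in utf8_bytes])  # ['0xce', '0xbb']

# An editor that assumes code page 1252 (or ISO-8859-1) reads each byte
# as a separate character, so one lambda turns into two odd characters.
as_cp1252 = utf8_bytes.decode("cp1252")
as_latin1 = utf8_bytes.decode("latin-1")
print([f"U+{ord(c):04X}" for c in as_cp1252])  # ['U+00CE', 'U+00BB']

# Decoding the same bytes as UTF-8 gives the lambda back unchanged.
roundtrip = utf8_bytes.decode("utf-8")
print(roundtrip == lam)  # True
```

U+00CE is Î and U+00BB is », which is the pair Notepad++ showed. The bytes in the file are fine; only the editor's guess about the encoding is wrong.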
In Notepad++, choose UTF-8 from the menu bar ‘Encoding’ entry.
In Emacs, type C-x C-m c utf-8-dos (or utf-8-unix, or whichever line-ending variant you want) as a prefix to the command that opens or saves the file. Hopefully by saving in UTF-8 you'll avoid whatever the problem is with the ISO 8859-7 (Greek) map; you certainly don't want to be saving any files in 8859-7, or indeed anything but UTF-8, if you can help it.
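In byte terms, the -dos and -unix variants of the Emacs coding system differ only in the line ending they write: CRLF for utf-8-dos, LF for utf-8-unix. The Python sketch below imitates that difference; the sample comment line is made up for illustration, and the code only mimics the resulting bytes, it does not call Emacs.

```python
# A made-up comment line containing a lambda (U+03BB), ending in a newline.
text = "// \u03bb is the rate parameter\n"

# LF line ending: the bytes a utf-8-unix save would be expected to produce.
unix_bytes = text.encode("utf-8")

# CRLF line ending: the bytes a utf-8-dos save would be expected to produce.
dos_bytes = text.replace("\n", "\r\n").encode("utf-8")

# The lambda is stored as the same two bytes in both variants.
print(b"\xce\xbb" in unix_bytes, b"\xce\xbb" in dos_bytes)  # True True

# The only difference is the extra carriage return before each line feed.
print(len(dos_bytes) - len(unix_bytes))  # 1
```

On Windows, utf-8-dos is usually the less surprising choice, because it matches the line endings other Windows editors write.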