A recent problem* left me wondering whether there is a text editor out there that lets you see every single character of a file, even the invisible ones. Specifically, I'm not looking for hex editing capabilities; I am interested in a text editor that will show me all of the invisible characters (not just the common whitespace / line break characters). The BOM is just one example; others are e.g. mathematical invisibles or possibly unsupported characters.
I'm not looking for a text editor that simply supports a large variety of text encodings / translations between encodings. All text editors I've come across treat the invisible characters "correctly", i.e. leave them invisible (or simply remove them in the translation, as in the case of the BOM).
I'm asking this mostly out of academic interest, so I'm not particular about any specific OS. I can easily test Linux and OS X solutions, but if you recommend a Windows editor, I would appreciate it if you included a description of how the editor handles invisibles other than whitespace / line breaks.
EDIT: I'm beginning to be sure that the behavior I want can be implemented in emacs/vim via either custom highlighting or by messing around with the font itself. A solution of this type would also be acceptable.
EDIT2: After looking at several options I found TextMate which at least shows a blank space where an invisible UTF-8 character is in the file. Slightly disappointed with SO's ability to answer my question. Bounty goes to VIM, because that is the direction in which the solution most likely lies.
*The incident that led me to this question: I wrote a Perl script using TextWrangler and managed to change the encoding to UTF-8 with BOM, which inserts the BOM at the start of the file. Perl (or rather the operating system) promptly misses the #! and mayhem ensues. It then took me the better part of an afternoon to figure this out, since most text editors do not show the BOM even with various "show invisibles" options turned on. Now I've learned my lesson and will use less immediately :-).
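For anyone hitting the same BOM problem, here is a minimal Python sketch that checks the first bytes of a file for a byte-order mark and for the #! line. The function names detect_bom and shebang_is_first are my own, not from any tool mentioned here, and it only covers the UTF-8/16/32 marks that the standard codecs module defines.

```python
import codecs

# Known byte-order marks. UTF-32 LE must be tested before UTF-16 LE,
# because the UTF-32 LE mark (FF FE 00 00) starts with the UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM that `data` starts with, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def shebang_is_first(data: bytes) -> bool:
    """True only if the very first two bytes are '#!'."""
    return data.startswith(b"#!")
```

To use it, read the head of the script in binary mode, e.g. open(path, "rb").read(8), and pass the bytes to both functions. A script saved as "UTF-8 with BOM" should report a UTF-8 BOM and fail the shebang check.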
vim (in either textual or graphic mode) can show all control characters if you :set list. The BOM is a special case, controlled by the :set bomb or :set nobomb commands.
In Visual Studio's Open File dialog, the Open pushbutton has a down arrow next to it that lets you choose Open With.... One of the options in the resulting dialog is Binary Editor.
I've used this now and then to spot some invisible character or to resolve some line-ending issue.
Notepad++ rocks:
Open the file in Emacs and do M-x hexl-mode. You'll get a display that looks like this:
    87654321  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789abcdef
    00000000: 2320 2020 2020 2020 2020 2020 2020 2020  #
    00000010: 2020 2020 2020 2020 2020 2020 2020 2020
    00000020: 2020 2020 2020 2020 2020 2020 2020 2020
    00000030: 2d2a 2d20 4175 746f 636f 6e66 202d 2a2d  -*- Autoconf -*-
    00000040: 0a23 2050 726f 6365 7373 2074 6869 7320  .# Process this
    00000050: 6669 6c65 2077 6974 6820 6175 746f 636f  file with autoco
    00000060: 6e66 2074 6f20 7072 6f64 7563 6520 6120  nf to produce a
    00000070: 636f 6e66 6967 7572 6520 7363 7269 7074  configure script
    00000080: 2e0a 2320 4f72 6465 7220 6973 206c 6172  ..# Order is lar
    00000090: 6765 6c79 2069 7272 6576 656c 6c61 6e74  gely irrevellant
    000000a0: 2c20 616c 7468 6f75 6768 2069 7420 6d75  , although it mu
    000000b0: 7374 2073 7461 7274 2077 6974 6820 4143  st start with AC
    000000c0: 5f49 4e49 5420 616e 6420 656e 6420 7769  _INIT and end wi
    000000d0: 7468 2041 435f 4f55 5450 5554 0a23 2053  th AC_OUTPUT.# S
    000000e0: 6565 2068 7474 703a 2f2f 6175 746f 746f  ee http://autoto
    000000f0: 6f6c 7365 742e 736f 7572 6365 666f 7267  olset.sourceforg
    00000100: 652e 6e65 742f 7475 746f 7269 616c 2e68  e.net/tutorial.h
I've encountered the same limitations. My specific issue is the need to display characters like U+200B, the zero-width space, and U+200C, the zero-width non-joiner. (These are used in electronic texts in languages such as Khmer, which otherwise do not separate words with spaces.) Unlike you, instead of "platform doesn't matter," I need an editor with Windows and Linux versions, and a Mac version would be desirable too.
I haven't found any text editors that will let you display them on-screen, although some (many?) will let you enter them and will properly treat them as characters that can be cut and pasted and whose presence is indicated via cursor movement. (That is, if the screen shows "if" and there are three ZWSP's between the "i" and "f," you have to press the arrow key four times to move from "i" to "f.")
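Short of an editor that does this, a small script can at least make such characters visible. Below is a rough Python sketch that replaces every "invisible" character with a tag like <U+200B>. The function name reveal and the choice of which Unicode general categories count as invisible are my own assumptions; adjust the set to taste.

```python
import unicodedata

# General categories treated as "invisible" in this sketch (an assumption):
# Cc control, Cf format (includes ZWSP, ZWNJ, BOM, invisible math operators),
# Zl/Zp line and paragraph separators, Cn unassigned, Co private use.
INVISIBLE_CATEGORIES = {"Cc", "Cf", "Zl", "Zp", "Cn", "Co"}


def reveal(text: str, keep: str = "\n") -> str:
    """Replace invisible characters with a visible <U+XXXX> tag.

    Characters listed in `keep` (newline by default) are passed through
    unchanged so the output keeps its line structure.
    """
    out = []
    for ch in text:
        if ch not in keep and unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


# The "if" example from above, with three zero-width spaces in between.
print(reveal("i\u200b\u200b\u200bf"))  # prints i<U+200B><U+200B><U+200B>f
```

Reading a file with open(path, encoding="utf-8") and passing the contents through reveal also exposes a leading BOM as <U+FEFF>, since the plain utf-8 codec does not strip it.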
TextPad 4.7.3 is otherwise my text editor of choice, but it is very limited in its acceptance of scripts; and TextPad 5 definitely does not show these invisibles.
I have often resorted to opening my files in OpenOffice.org Writer, which will show a gray slash at these characters' location with invisibles turned on, and Microsoft Word, which displays a double-box (box within a box) character for such invisibles. This double-box has width and changes the line-breaks on-screen, which is not trivial and which I haven't seen in any other editor.
You can also use Notepad++ to show them. Here is an example; the black boxes are control characters.
I prefer UltraEdit even though it is not free. It is very capable of showing hidden characters, including a robust HEX viewing mode. (I am not affiliated with the publisher, IDM.)
I am not sure as I haven't used it in a while, but I remember that SciTE was a good one that showed me "too much information" for my needs.
Programmer's Notepad on Windows might work.
TextPad (It's nagware, runs on Windows)
I'm not sure which of these will show the hidden characters out of the box, but they're all made for "nerdy" stuff, so I assume that they would work, at least with a little tweaking. I can verify that Programmer's Notepad does show "hidden" characters.
If you are running a 32-bit version of Windows, you can see BOMs and other invisible characters, such as carriage returns or line feeds (which look like a musical eighth note), in the MS-DOS Editor, which you can open by typing "edit" in the Run box or at a command prompt. Unfortunately, the MS-DOS Editor is not available on 64-bit systems :(