Update
Since nobody has given a good enough answer yet, I will reformulate the question:
How can I tell, from within a shell script on Linux, whether the current bitmap font supports a given Unicode character?
That is: not within gnome-terminal with its fancy TTF support and different charsets, but on the bare-metal tty without X.org.
Old question (may clarify something in the sentences above)
I am writing a program with special characters in it. Characters like װאבדג (Hebrew).
Using an Ubuntu machine I had handy, I could get them to work inside the X environment (inside gnome-terminal). In rxvt, I get strange characters instead of what I have in the file, and in bare xterm I get only some of them.
The file itself may be just as simple as
letters="⅄ႥႣႬזלבגװאבדגהוזחטענסףמלךלכפץצקႠႣႤႥႬႫႹჄႾႨ"
letters=$(echo $letters | sed -e 's/./\0\n/g')
letters=$(for i in $letters; do echo "$RANDOM$i" done | sort -rn | sed -e 's/[0-9]*//g')
echo $letters
In OS X it just shows "nnnnnnnnnnnnnnnnnnnn".
Within the tty without X.Org started, it just shows a diamond.
In all the terminals, I have
LANG=es_ES.UTF-8
Is there any way to know from within the script whether the characters will be shown correctly (so I could implement a fallback if not), or to set up the terminal so that it shows them?
You have a bug here:
echo $letters | sed -e 's/./\0\n/g'
EDIT (Since you mention you are on OS X I removed the part talking about GNU Sed)
With the version of sed built in to OS X, \0\n means "0n" (the character zero and the character n).
You are replacing every character in your input with that text, so you should not be surprised that you are not seeing them in the output.
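A portable way to refer to the whole match in the replacement is &, which both BSD sed and GNU sed accept. As a minimal sketch (the helper name mark_chars and the ASCII sample string are only illustrations):

```shell
#!/bin/sh
# "&" in the replacement stands for the text matched by the pattern.
# It is part of POSIX sed, so BSD sed (OS X) and GNU sed both accept it.
mark_chars() {
    # Append a "-" after every character of the first argument.
    printf '%s\n' "$1" | sed 's/./&-/g'
}

mark_chars abc   # prints a-b-c-
```

With \0 in place of &, BSD sed would emit a literal 0 for each character instead.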
On Mac OS X you can check Terminal.app for UTF-8 readiness:
defaults read com.apple.Terminal StringEncoding # 4
defaults read com.apple.Terminal DoubleWideChars # YES
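Furthermore, Mac OS X uses FreeBSD sed, which does not accept \0 as a reference to the whole match; use & instead. Any of the following prints one character per line: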
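# BSD sed (the one shipped with OS X); $'...' quoting needs bash
printf "%s" "$letters" | sed $'s/./&\\\n/g'
# GNU sed (commonly installed as gsed on OS X)
printf "%s" "$letters" | gsed $'s/./&\\\n/g'
# awk with an empty field separator (splits per character in common awks; POSIX leaves this unspecified)
printf "%s" "$letters" | awk -v FS="" '{for(i=1;i<=NF;i++) print $i}'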
# randomize letters
letters=$(echo $letters | sed $'s/./&\\\n/g')
# note the additional ";" after "${RANDOM}${i}"
letters=$(for i in $letters; do echo "${RANDOM}${i}"; done | sort -rn | sed -e 's/[0-9]*//g')
echo $letters
You can at least check whether your current terminal encoding is set to handle UTF-8 characters. If it is, the terminal will decode them; whether the loaded bitmap font actually contains a glyph for each character is a separate matter, since a Linux console font holds at most 512 glyphs.
LC_ALL= locale charmap # UTF-8
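The charmap check can be folded into the script to pick a fallback. This is only a sketch: the helper name charmap_is_utf8 and the ASCII fallback string are made up for illustration, and a UTF-8 locale says nothing about which glyphs the font has.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given charmap name denotes UTF-8.
charmap_is_utf8() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# "locale charmap" prints the codeset of the current locale, e.g. "UTF-8".
if charmap_is_utf8 "$(locale charmap 2>/dev/null)"; then
    letters="װאבדג"
else
    letters="abcde"   # ASCII fallback, chosen only for illustration
fi
printf '%s\n' "$letters"
```

Passing the charmap name as an argument keeps the decision separate from the environment, so the helper can be exercised with any value.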
The value of the $TERM environment variable may also give a hint whether your current terminal is capable of handling UTF-8 characters, e.g. rxvt vs. urxvt.
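When $TERM is linux (the bare console), one possible approach is to dump the Unicode map of the loaded console font with setfont -ou from the kbd package and search it for the code point. The sketch below rests on assumptions: that setfont is installed and can reach the console, and that the dump lists entries in the form U+XXXX. The helper name unimap_has_codepoint is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the Unicode map dump in file $1 lists
# code point $2 (hex digits without prefix, e.g. 05d0 for HEBREW LETTER ALEF).
# Assumption: the dump contains entries written as U+XXXX.
unimap_has_codepoint() {
    grep -qiE "U\+0*$2([^0-9a-f]|\$)" "$1"
}

# Only attempt the dump on the Linux console and when setfont is installed.
if [ "${TERM:-}" = linux ] && command -v setfont >/dev/null 2>&1; then
    map=$(mktemp) || exit 1
    if setfont -ou "$map" 2>/dev/null && unimap_has_codepoint "$map" 05d0; then
        echo "font has a glyph for U+05D0"
    else
        echo "no glyph for U+05D0 (or the map could not be read); use a fallback"
    fi
    rm -f "$map"
fi
```

Treat a failed dump the same as a missing glyph, so the script falls back rather than printing placeholder diamonds.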
And last but not least, you may play around with tools such as tconv, ttyconv or luit to convert to and from UTF-8.
See:
- Re: A call for fixing aterm/rxvt/etc... (tconv)
- "tail -f | iconv -fsjis" does not output anything (ttyconv, luit)