I want to format long numbers using thousand separator. It can be done using to_char
function just like:
SELECT TO_CHAR(76543210.98, '999G999G990D00')
But when my PostgreSQL server with UTF-8 encoding is on Polish version of Windows such SELECT ends with:
ERROR: invalid byte sequence for encoding "UTF8": 0xa0
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
In to_char
pattern G
is described as: group separator (uses locale).
This SELECT works without error when server is running on Linux with Polish locale.
As a workaround I use space instead of G
in format string, but I think there should be way to set thousand separator开发者_JS百科 just like in Oracle:
ALTER SESSION SET NLS_NUMERIC_CHARACTERS=', ';
Is such setting available for PostgreSQL?
If you use psql
, you can execute this:
\pset numericlocale
Example:
test=# create temporary table a (a numeric(20,10));
CREATE TABLE
test=# insert into a select random() * 1000000 from generate_series(1,3);
INSERT 0 3
test=# select * from a;
a
-------------------
287421.6944910590
140297.9311533270
887215.3805568810
(3 rows)
test=# \pset numericlocale
Showing locale-adjusted numeric output.
test=# select * from a;
a
--------------------
287.421,6944910590
140.297,9311533270
887.215,3805568810
(3 rows)
I'm pretty sure the error message is literally true: 0xa0 isn't a valid UTF-8 character.
My home server is running PostgreSQL on Windows XP, SP3. I can do this in psql.
sandbox=# show client_encoding;
client_encoding
-----------------
UTF8
(1 row)
sandbox=# show lc_numeric;
lc_numeric
---------------
polish_poland
(1 row)
sandbox=# SELECT TO_CHAR(76543210.98, '999G999G990D00');
to_char
-----------------
76 543 210,98
(1 row)
I don't get an error message, but I get garbage for the separator. Could this be a code page issue?
As a workaround I use space instead of G in format string
Let's think about this. If you use a space, then on a web page the value might split at the end of a line or at the boundary of a table cell. I'd think a nonbreaking space might be a better choice.
And, in Unicode, a nonbreaking space is 0xa0. In Unicode, not in UTF8. (That is, 0xa0 can't be the first byte of a UTF8 character. See UTF-8 Bit Distribution.)
Another possibility is that your client is expecting one byte order, and the server is giving it a different byte order. Since the numbers are single-byte characters, the byte order wouldn't matter until, well, it mattered. If the client is expecting a big endian MB character, and it got a little endian MB character beginning with 0xa0, I'd expect it to die with the error message you saw. I'm not sure I have a way to test this before I go to work today.
精彩评论