I have to create software that must work on several *nix platforms (Linux, AIX, ...).
I need to handle internationalization and my translation strings are in the following form:
"Hi %1, you are %2." // English
"Vous êtes %2, bonjour %1 !" // French
Here %1 stands for the name, and %2 for another word. I may change the format; that's not an issue.
I tried to use printf(), but you cannot specify the order of the parameters; you just specify their types.
"Hi %s, you are %s"
"Vous êtes %s, bonjour %s !"
Now there is no way to know which parameter to use for each %s: printf() just uses the first one, then the next.
Is there any alternative to printf() that deals with this?
Note: gettext() is not an option.
I don't mean to be the bearer of bad tidings but what you're proposing is actually a bad idea. I work for a company that takes i18n very seriously and we've discovered (painfully) that you cannot just slot words into sentences like that, since the results often make no sense.
What we do is simply disconnect the error text from the variable bits altogether, so as to avoid these problems. For example, we'll generate an error:
XYZ-E-1002 Frobozz not configured for multiple zorkmids (F22, 7).
And then, in the description of the error, you state simply that the two values in the parentheses at the end were the Frobozz identifier and the number of zorkmids you tried to inflict on it.
This leaves i18n translation as an incredibly easy task since you have, at translation time, all of the language elements you need without worrying whether the variable bits should be singular or plural, masculine or feminine, first, second, or third declension (whatever the heck that actually means).
The translation team simply has to convert "Frobozz not configured for multiple zorkmids", and that's a lot easier.
For those who would like to see a concrete example, I have something back from our translation bods (with enough stuff changed to protect the guilty).
At some point, someone submitted the following:
The {name} {object} is invalid
where {name} was the name of an object (customers, orders, etc.) and {object} was the object type itself (table, file, document, stored procedure, etc.).
Simple enough for English, the primary (probably only) language of the developers, but they struck a problem when translating to German/Swiss-German.
While the "customers document" translated correctly (in a positional sense) to Kundendokument, the fact that the format string had a space between the two words was an issue. That was basically because the developers were trying to get the sentence to sound more natural but, unfortunately, only more natural based on their limited experience.
A bigger problem was with the "customers stored procedure" which became gespeichertes Verfahren der Kunden, literally "stored procedure of the customers". While the German customers may have put up with a space in Kunden dokument, there is no way to impose gespeichertes Verfahren der Kunden onto {name} {object} successfully.
Now you may say that a cleverer format string would have fixed this but there are several reasons why that would be incorrect:
- this is a very simple example; there are likely to be others that are more complex (I'd try to get some examples but our translation bods have made it clear they have more pressing work than to submit themselves to my every whim).
- the whole point of the format strings is to externalise translation. If the format strings themselves are specific to the translation target, you've gained very little by externalising the text.
- developers should not have to concern themselves with format strings like {possible-pre-adjectives} {possible-pre-owner} {object} {possible-post-adjectives} {possible-post-owner} {possible-postowner-adjectives}. That is the job of the translation teams since they understand the nuances.
Note that introducing the disconnect sidesteps this issue nicely:
The object specified by <parameter 1>, of type <parameter 2>, is invalid. Parameter 1 = {name}. Parameter 2 = {object}.
Der sache nannte <parameter 1>, dessen art <parameter 2> ist, ist falsch. Parameter 1 = {name}. Parameter 2 = {object}.
That last translation was one of mine; please don't use it to impugn the quality of our translators. No doubt more fluent German speakers will get a good laugh out of it.
POSIX printf() supports positional arguments.
printf("Hi %1$s, you are %2$s.", name, status);
printf("Vous êtes %2$s, bonjour %1$s !", name, status);
boost.format supports this too, in a way similar to Python's string formatting; however, it is for C++ only.
You want the %n$s extension that is common to most Unix systems.
"Hi %1$s, you are %2$s."
See the German example at the bottom of the printf man page.
Regards DaveF