Storing i18n data in a database using XML

Developer https://www.devze.com 2023-01-01 20:25 Source: the web

I may have to store some i18n-ed data in my database using XML if I don't fight back. That's not my choice, but it's in the specifications I have to follow. We would have, for example, something like the following in a 'Country' column:

<lang='fr'>Etats-Unis</lang>
<lang='en'>United States</lang>

This would apply to many columns in the database.

I don't think it's a good idea at all. I tend to think that a cell in a database should hold a single piece of data (which is better for look-ups), and that the data should have at most two dimensions, not three or more: each extra dimension would require one more request, and here the size of the extra dimension would equal the number of XML attributes.
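To make the look-up concern concrete, here is a minimal Python sketch of what reading one translation out of such a cell could involve. It assumes a well-formed variant of the snippet above in which the language is carried by a hypothetical `code` attribute (the fragment as written in the question is not valid XML), and `translation_from_cell` is an illustrative helper name, not part of any specification.

```python
import xml.etree.ElementTree as ET

# Assumed, well-formed variant of the 'Country' cell content.
# The attribute name "code" is hypothetical.
cell = "<lang code='fr'>Etats-Unis</lang><lang code='en'>United States</lang>"


def translation_from_cell(cell, language):
    """Return the text for `language` from an XML cell, or None if absent."""
    # The cell holds sibling elements with no single root, so wrap them
    # in a throwaway root element before parsing.
    root = ET.fromstring("<translations>" + cell + "</translations>")
    for element in root.findall("lang"):
        if element.get("code") == language:
            return element.text
    return None


print(translation_from_cell(cell, "en"))
```

Every read of a single translation has to parse the whole cell, which is the extra step the paragraph above objects to.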

My idea was to have a separate table for all the translations, with columns such as ID / Language / Translation. However, I should admit that I'm really not sure what the best way is to store data in various languages in a DB...
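As a rough illustration of the separate-table idea, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `translation`, the column types, and the helper `country_name` are assumptions made for the example; only the three column names come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translation (
        id          INTEGER NOT NULL,  -- ID of the translated item (here: a country)
        language    TEXT    NOT NULL,  -- language code such as 'fr' or 'en'
        translation TEXT    NOT NULL,
        PRIMARY KEY (id, language)     -- one translation per item and language
    );
    INSERT INTO translation VALUES
        (1, 'fr', 'Etats-Unis'),
        (1, 'en', 'United States');
""")


def country_name(conn, country_id, language):
    """Look up one translation; returns None when none is stored."""
    row = conn.execute(
        "SELECT translation FROM translation WHERE id = ? AND language = ?",
        (country_id, language),
    ).fetchone()
    return row[0] if row else None


print(country_name(conn, 1, "fr"))
```

With this layout each cell holds a single value, and a look-up is a plain indexed query on (id, language).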

Thanks for your advice :)

I agree with your way of thinking.

In a previous project, we did the same thing (stored I18N values in DB), in addition to traditional I18N (resource files in solution).

The reason for this was special dynamic data (not static strings) that required translation. We had sports, such as "Soccer", which needed to be translated into a language-specific description.

To do this, we had a simple lookup table, with columns (LCID - int, SportID - int, Description - nvarchar(256)).

The stored procedures/functions accepted the locale as an integer (e.g. Spanish - 3082), and the SQL would then return the appropriate culture-sensitive description based on that locale. This way the web server code had no knowledge of what happened behind the scenes.

So you might have the following few rows:

Lcid       SportID      Description
3081       1            Soccer
3082       1            Futbol

Then you simply join on this table in your SQL to facilitate the I18N.
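A minimal sketch of that join, written with Python's `sqlite3` so it can be run as-is. The table names `Sport` and `SportDescription` and the function `list_sports` are hypothetical, and `TEXT` stands in for `nvarchar(256)`; the original project used stored procedures, which SQLite does not have, so a parameterised query plays that role here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sport (SportID INTEGER PRIMARY KEY);
    CREATE TABLE SportDescription (
        Lcid        INTEGER NOT NULL,
        SportID     INTEGER NOT NULL REFERENCES Sport(SportID),
        Description TEXT    NOT NULL,
        PRIMARY KEY (Lcid, SportID)
    );
    INSERT INTO Sport VALUES (1);
    INSERT INTO SportDescription VALUES
        (3081, 1, 'Soccer'),
        (3082, 1, 'Futbol');
""")


def list_sports(conn, lcid):
    """Return (SportID, Description) rows for the given locale ID."""
    return conn.execute(
        """
        SELECT s.SportID, d.Description
        FROM Sport AS s
        JOIN SportDescription AS d ON d.SportID = s.SportID
        WHERE d.Lcid = ?
        ORDER BY s.SportID
        """,
        (lcid,),
    ).fetchall()


print(list_sports(conn, 3082))
```

The caller only passes the LCID; which description comes back is decided entirely by the join.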

The small drawback of this method, though, is that your stored procedures will require the LCID as a parameter from the web server code.

The sports were updated/imported via a bulk import from an Excel sheet, which is why we needed the I18N in the DB.

This is the only way I can envision DB-based I18N working.

While XML is possibly not the best type to store it in (although, depending on DBMS support, it may not be as bad for performance as it first looks), the concept of translations as a data type problem, rather than a normalisation problem, has some merit.

My original thoughts on the topic are here, and I still stand by them after my colleagues decided to use them in the real world, although I may yet try to write a more efficient implementation.

Basically, if you think of translated strings as "representations" of a single "fact", then normalisation ceases to be the right tool for the job. Instead, you can think of "translated string" as an atomic type, with operations for getting at the right (or best available) translation, adding a new translation, etc.
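To give a feel for what such an atomic type could look like, here is a small Python sketch. It is not the implementation from the blog post; the class name `TranslatedString`, its methods, and the "first match in a preference list" rule for the best available translation are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class TranslatedString:
    """One 'fact' with several language-specific representations."""

    translations: Dict[str, str] = field(default_factory=dict)

    def get(self, preferred: Sequence[str]) -> Optional[str]:
        """Return the first available translation in preference order."""
        for lang in preferred:
            if lang in self.translations:
                return self.translations[lang]
        return None

    def with_translation(self, lang: str, text: str) -> "TranslatedString":
        """Return a new value with one translation added or replaced."""
        return TranslatedString({**self.translations, lang: text})


name = TranslatedString({"en": "United States"})
name_fr = name.with_translation("fr", "Etats-Unis")
print(name_fr.get(["fr", "en"]))  # exact match for the first preference
print(name.get(["fr", "en"]))     # falls back to English
```

Callers treat the whole value as one unit and never touch the per-language storage directly, which is the sense in which it behaves like an atom.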

The notion of what is "atomic" is a tricky one, philosophically. I talk in my blog post about whether a single string is "atomic": it can be decomposed into individual characters, and you might even want to know what the first character is, but in terms of the realm you are modelling, that is generally considered representational detail, and the whole string is an atom. Less extremely, PostgreSQL includes a "point" type, which is basically just an array of two numbers, but allows for operations such as "contains" and "is above" to be expressed directly.