开发者

What's the best database structure to keep multilingual data? [duplicate]

开发者 https://www.devze.com 2022-12-19 10:01 出处:网络
This question already has answers here: Closed 10 years ago. Possible Duplicate: Schema for a multilanguage database
This question already has answers here: Closed 10 years ago.

Possible Duplicate:

Schema for a multilanguage database

Here's an example:

[ products ]
id (INT)
name-en_us (VARCHAR)
name-es_es (VARCHAR)
name-pt_br (VARCHAR)
description-en_us (VARCHAR)
description-es_es (VARCHAR)
description-pt_br (VARCHAR)
price (DECIMAL)

The problem: every new language will need modify the table structure.

Here's another example:

[ products-en_us ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

[ products-es_es ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

The problem: every new language will need the creation of new tables and the "price" field is duplicated in every table.

Here's another example:

[ lang开发者_开发百科uages ]
id (INT)
name (VARCHAR)

[ products ]
id (INT)
price (DECIMAL)

[ translation ]
id (INT, PK)
model (VARCHAR) // product
field (VARCHAR) // name
language_id (INT, FK) 
text (VARCHAR)

The problem: hard?


Similar to method 3:

[languages]
id (int PK)
code (varchar)

[products]
id (int PK)
neutral_fields (mixed)

[products_t]
id (int FK)
language (int FK)
translated_fields (mixed)
PRIMARY KEY: id,language

So for each table, make another table (in my case with "_t" suffix) which holds the translated fields. When you SELECT * FROM products, simply ... LEFT JOIN products_t ON products_t.id = products.id AND products_t.language = CURRENT_LANGUAGE.

Not that hard, and keeps you free from headaches.


Your third example is actually the way the problem is usually solved. Hard, but doable.

Remove the reference to product from the translation table and put a reference to translation where you need it (the other way around).

[ products ]
id (INT)
price (DECIMAL)
title_translation_id (INT, FK)

[ translation ]
id (INT, PK)
neutral_text (VARCHAR)
-- other properties that may be useful (date, creator etc.)

[ translation_text ]
translation_id (INT, FK)
language_id (INT, FK) 
text (VARCHAR)

As an alternative (not especially a good one) you can have one single field and keep all translations there merged together (as XML, for example).

<translation>
  <en>Supplier</en>
  <de>Lieferant</de>
  <fr>Fournisseur</fr>
</translation>


In order to reduce the number of JOIN's, you could keep separate the translated and non translated in 2 separate tables :

[ products ]
id (INT)
price (DECIMAL)

[ products_i18n ]
id (INT)
name (VARCHAR)
description (VARCHAR)
lang_code (CHAR(5))


At my $DAYJOB we use gettext for I18N. I wrote a plugin to xgettext.pl that extracts all English text from the database tables and add them to the master messages.pot.

It works very well - translators deal with only one file when doing translation - the po file. There's no fiddling with database entries when doing translations.


[languages] id (int PK) code (varchar)

[products]
id (int PK)
name
price
all other fields of product
id_language ( int FK )

I actually use this method, but in my case, it's not in a product point of view, for the various pages in my CMS, this work's quite well.

If you have a lot of products it might be a headache to update a single one in 5 or 6 languages... but it's a question of working the layout.


What about fourth solution?

[ products ]
id (INT)
language (VARCHAR 2)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)
*translation_of (INT FK)*

*Translation_of* is FK of it self. When You add default language *translation_of* is set to Null. But when you add second language *translation_of* takes primary produkt language id.

SELECT * FROM products WHERE id = 1 AND translation_of = 1

In that case we get all translations for product with id is 1.

SELECT * FROM products WHERE id = 1 AND translation_of = 1 AND language = 'pl'

We get only product in Polish translation. Without second table and JOINS.


Have many to many relationship.

You have your data table, languages table and a data_language table.

In the data_language table you have

id, data_id, language_id

I think that might work best for your.


We use this concept for our webiste (600k views per day) and (maybe surprisingly) it works. Sure along with caching and query optimalization.

[attribute_names]
id (INT)
name (VARCHAR)

[languages_names]
id (INT)
name (VARCHAR)

[products]
id (INT)
attr_id (INT)
value (MEDIUMTEXT)
lang_id (INT)
0

精彩评论

暂无评论...
验证码 换一张
取 消