charsets in MySQL replication

Developer https://www.devze.com 2023-01-03 17:25 Source: Internet
What can I do to ensure that replication will use latin1 instead of utf-8?

I'm migrating between a MySQL 5.1.22 server (master) on a Linux system and a MySQL 5.1.42 server (slave) on a FreeBSD system. My replication works well, but when non-ASCII characters are in my varchars, they turn "weird". The Linux/MySQL-5.1.22 server shows the following character set variables:

character_set_client=latin1
character_set_connection=latin1
character_set_database=latin1
character_set_filesystem=binary
character_set_results=latin1
character_set_server=latin1
character_set_system=utf8
character_sets_dir=/usr/share/mysql/charsets/
collation_connection=latin1_swedish_ci
collation_database=latin1_swedish_ci
collation_server=latin1_swedish_ci

While the FreeBSD/MySQL-5.1.42 server shows:

character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=utf8
character_set_system=utf8
character_sets_dir=/usr/local/share/mysql/charsets/
collation_connection=utf8_general_ci
collation_database=utf8_general_ci
collation_server=utf8_general_ci

Setting any of these variables from the MySQL CLI has no effect, and setting them in my.cnf or at the command line makes the server not start.

Of course, both servers have the tables in question created the same way, in this case with DEFAULT CHARSET=latin1. Let me give you an example:

CREATE TABLE `test` (
  `test` varchar(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

When I run "INSERT INTO test VALUES ('æøå')" on the master from a Latin1 terminal, the row looks like this on the slave when I select it from a Latin1 terminal:

+--------+
| test   |
+--------+
| æøå    |
+--------+

On a UTF-8 based terminal on the replication slave, test contains:

+--------+
| test   |
+--------+
| æøå    |
+--------+

So my conclusion is that it is converted to utf8, even though the table definition is latin1. Is this a correct conclusion?

Of course, on the master, in a latin1 terminal, it still says:

+------+
| test |
+------+
| æøå  | 
+------+

Since both system character sets are utf-8, I set both terminals to utf-8 and ran "INSERT INTO test VALUES ('æøå')" again on the master. On the slave, with a utf-8 terminal, I get:

+------------+
| test       |
+------------+
| æøà     |
+------------+

If my conclusion is correct, all my replicated data is converted to utf8 (if it is utf8, it is treated as latin1 and converted to utf8), while all the old data in the table is, as the CREATE TABLE suggests, latin1. I'd love to convert it all to utf-8 if it weren't for the fact that legacy applications rely on it being latin1, so I need to keep it in latin1 while they still exist.

What can I do to ensure that the replication reads latin1, treats it as latin1 and writes it on the slave as latin1?

Cheers

Nik


Replication between servers whose global character_set_% and collation% settings differ isn't supported.

http://dev.mysql.com/doc/refman/5.6/en/replication-features-charset.html

-- on both servers check the output of...
SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collat%';
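
Once both outputs are in hand, the comparison itself is mechanical. Below is a minimal Python sketch that lists the variables that differ. The values are typed in by hand from the question; in practice they would come from the SHOW VARIABLES output on each server. The function name and the prefix filter are illustrative choices, not part of any MySQL tool.

```python
PREFIXES = ("character_set_", "collation_")


def charset_mismatches(master: dict, slave: dict) -> dict:
    """Return {name: (master_value, slave_value)} for charset/collation variables that differ."""
    names = sorted(set(master) | set(slave))
    return {
        name: (master.get(name), slave.get(name))
        for name in names
        if name.startswith(PREFIXES) and master.get(name) != slave.get(name)
    }


# Values copied from the question (Linux master, FreeBSD slave).
master_vars = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "collation_connection": "latin1_swedish_ci",
    "collation_database": "latin1_swedish_ci",
    "collation_server": "latin1_swedish_ci",
}
slave_vars = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "utf8",
    "character_set_system": "utf8",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8_general_ci",
    "collation_server": "utf8_general_ci",
}

mismatches = charset_mismatches(master_vars, slave_vars)
for name, (master_value, slave_value) in mismatches.items():
    print(f"{name}: master={master_value} slave={slave_value}")
```

For the two servers in the question, only character_set_filesystem and character_set_system agree.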

Not only can replication fail if character sets and collations differ, it can also produce different sort orders and lose characters when converting between character sets. Sort order can affect statements such as INSERT and UPDATE when using statement-based replication.

You're best off configuring the new server to use the same character sets and collations as the old server. This will ensure replication works properly. You'll also want to make sure that databases, tables and columns all have the same collations on master and slave. Once you have migrated to the new server, you can change the character set and collation with tools like the MySQL 5.6 online schema change (online DDL) or pt-online-schema-change from Percona Toolkit.

I also recommend running Percona's pt-table-checksum to make sure your tables haven't diverged during replication or the initial export/import.

See here for more information about the impact of these differences:

  • http://dev.mysql.com/doc/refman/5.6/en/replication-features-charset.html
  • What's the difference between utf8_general_ci and utf8_unicode_ci
  • http://forums.mysql.com/read.php?103,187048,188748#msg-188748
  • http://dev.mysql.com/doc/refman/5.6/en/charset-unicode-sets.html
  • https://dba.stackexchange.com/questions/8006/whats-the-differences-between-utf8-general-ci-and-utf8-unicode-ci-and-utf8-bina

To anyone using Amazon RDS: keep in mind that the default MySQL 5.6 settings mix utf8 (utf8mb3) and latin1 (for server and database). Override them with a custom parameter group if you replicate between RDS and a non-RDS server, so that the source and destination servers match.


In general, you must use the exact same configuration file and version of MySQL on the slave (except during upgrade or migration scenarios, and for a few settings that need to differ on slaves, like server_id).

You will want to script your database setup so that your DB servers are part of your software deployment. It is essential that all database servers, including those in non-production environments, use the exact same configuration.

Failure to sync the configs will result in unexpected bugs.

I don't know why you feel the need to run different OSs on your different servers, but you're going to make life more difficult for your Ops staff if you do so.
