开发者

Case insensitive unicode collation in MySQL

开发者 https://www.devze.com 2023-01-30 04:30 出处:网络
I\'ve got a database where we store usernames with a capital first letter of each name -- ie, IsaacSparling.I\'m trying to do case insensitive autocomplete against my MySQL (v5.1.46) db.Table has 开发

I've got a database where we store usernames with a capital first letter of each name -- ie, IsaacSparling. I'm trying to do case insensitive autocomplete against my MySQL (v5.1.46) db. Table has 开发者_JS百科a charset of UTF8 and a collation of utf8_unicode_ci. I've done these tests against the utf8_general_ci collation as well.

Plain ASCII text works fine:

mysql> select username from users where username like 'j%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

mysql> select username from users where username like 'J%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

(names redacted, but they're there).

However, when I try to do the same for unicode characters outside the ASCII set, no such luck:

mysql> select username from users where username like 'ø%';
Empty set (0.00 sec)


mysql> select username from users where username like 'Ø%';
+-------------+
| username    |
+-------------+
| Ø*********  |
+-------------+
1 row in set (0.00 sec)

Some investigation has lead me to this: http://bugs.mysql.com/bug.php?id=19567 (tl;dr, this is a known bug with the unicode collations, and fixing it is at 'new feature' priority -- ie, won't be finished in any reasonable timeframe).

Has anybody discovered any effective workarounds that allow for case-insensitive searching for unicode characters in MySQL? Any thoughts appreciated!


Works fine for me with version 5.1.42-community

Maybe your mysql client did not send the unicode characters properly. I tested with sqlYog and it worked just fine with both utf8_unicode_ci and utf8_general_ci collations


IF what you care about is being able to order the field values by the text without caring if it is in upper or lower case I think the best thing you can do is when addressing the field instead of typing just username, type LOWER(username) username and then you can perfectly use an order by that field calling it by its name


Have you tried using CONVERT? Something like

WHERE `lastname` LIKE CONVERT( _utf8 'ø%' USING latin1 )

might work for you.


I just resolved the same problem using the query

show variables like '%char%';

My character_set_client was set to 'utf8', but character_set_connection and character_set_results were set to 'latin1'. Thus, the functions UPPER, LOWER, LIKE did not work as expected.

I just inserted the line

mysql_query("SET NAMES utf8");

right after connection to get the case-insensitive searching work.

0

精彩评论

暂无评论...
验证码 换一张
取 消