Croatian diacritic signs MySQL DB - like clause

Developer https://www.devze.com 2023-02-14 00:43 Source: Internet
I have a MySQL database using the InnoDB engine, with character set utf8 and collation utf8_general_ci (I also tried utf8_unicode_ci). I would like the database to treat č and c, ž and z, ć and c, š and s, and đ and d as equal. For example:

table1

-------------
id  | name
-------------
1   | mačka
2   | đemper
-------------

If I run the query:

SELECT * FROM table1 WHERE name LIKE '%mac%'

or

SELECT * FROM table1 WHERE name LIKE '%mač%'

I get the result:

-------------
id  | name
-------------
1   | mačka

This is fine; it is exactly what I want.

But if I run the query:

SELECT * FROM table1 WHERE name LIKE '%de%'

I get zero results.

And if I run query:

SELECT * FROM table1 WHERE name LIKE '%đe%'

I will get:

-------------
id  | name
-------------
2   | đemper

This is not the behaviour I want or expect. I would like both of the last two queries to return:

-------------
id  | name
-------------
2   | đemper

How can I accomplish this?

Any kind of help is appreciated, thanks in advance :) !


This can't be done without the use of regular expressions, as there is no collation in MySQL that considers đ equivalent to d.
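A minimal sketch of the regular-expression approach, assuming the table1 from the question. Alternation is used rather than a bracket class such as [dđ] because, in MySQL versions before 8.0, REGEXP works byte by byte and is not multibyte-safe, so a multibyte character inside brackets may not behave as a single character. Whether matching is case-insensitive for Đ can also depend on the server version, so treat this as a starting point rather than a definitive solution.

```sql
-- REGEXP matches anywhere in the string, so no % wildcards are needed.
-- (d|đ) accepts either the plain letter or the diacritic form.
SELECT *
FROM table1
WHERE name REGEXP '(d|đ)e';
```

Like a LIKE pattern with a leading %, a query of this form generally cannot use an index on name, so it scans the whole table.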


The collation you are using determines things like this -- what characters are considered 'equal', and what order they should sort in. But first off, you need to know what encoding your table is using.

The command SHOW TABLE STATUS LIKE 'table1'\G should show you that. That will help you determine the collation you need to use.
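As a short illustration of the check described above, the following statements can be run in the mysql command-line client (the \G terminator is a feature of that client and prints the result vertically). The second statement is an addition to what the answer mentions: it lists the collation of each column, which may differ from the table default.

```sql
-- Table-level default collation: look at the Collation field in the output.
SHOW TABLE STATUS LIKE 'table1'\G

-- Per-column collation: look at the Collation column for `name`.
SHOW FULL COLUMNS FROM table1;
```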

If it's Unicode (e.g. UTF-8), then you need to set a Unicode collation. There doesn't appear to be one built into MySQL for Croatian. You can check the MySQL Character Set manual page to see if anything there is 'close enough'.

If it's iso-latin-2 (iso-8859-2), then you can use 'latin2_croatian_ci' collation.

If it's CP-1250, then there is also a 'cp1250_croatian_ci' collation.
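A hedged sketch of how one of these collations could be tried, assuming the table1 and name column from the question. The first query applies the collation to a single comparison without changing the table; the ALTER TABLE statement converts the table permanently and should be tested on a copy first. Neither is guaranteed to make đ match d; that depends on how the collation weights those letters.

```sql
-- Try the collation for one query only; the stored data is not changed.
SELECT *
FROM table1
WHERE CONVERT(name USING latin2) COLLATE latin2_croatian_ci LIKE '%de%';

-- Permanently convert the table. Assumption: every stored character exists
-- in latin2 (ISO 8859-2); characters outside it cannot be represented.
ALTER TABLE table1
  CONVERT TO CHARACTER SET latin2 COLLATE latin2_croatian_ci;
```

For a CP-1250 table, the same pattern would use cp1250 and cp1250_croatian_ci instead.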

The non-unicode collations are in the manual here.

EDIT: As Ignacio Vazquez-Abrams correctly points out, none of the MySQL collations consider 'đ' to be equivalent to 'd'. (Reference for MySQL collations)

If you are really eager to put a lot of time into this, you can also read up on how to install your own custom collation.
