开发者

I get dual results from mysql query when using international characters, i.e Å/Ä=A & Ö=O,

开发者 https://www.devze.com 2023-01-22 10:31 出处:网络
For example if I search for the name Åsa i only want to get the name Åsa and not Asa, same with Björn instead of Bjorn

For example if I search for the name Åsa i only want to get the name Åsa and not Asa, same with Björn instead of Bjorn

$query="select * from users where username like 'Björn'";
$result=mysql_query($query);
$num=mysql_num_rows($result);
echo"$num";
$i=0;
while($i<$num){     
$id=mysql_result($result,$i,"id");
$name=mysql_result($result,$i,"username");    
echo"<br/>$id,$name";
$i++;
}

Results

34,Björn
67,Bjorn

only result 34 is supposed to be displayed

I am using

mysql_query("SET NAMES utf8");
mysql_开发者_高级运维query( "SET CHARACTER SET utf8");

The database, table and column are set to utf8_unicode_ci


Your "problem" is the utf8_unicode_ci collation. That collation does "character expansions", meaning that Umlauts and their base characters are treated as the same even in a = comparison:

A = Ä
O = Ö
...

The second example on this mySQL manual page explains the issue: 9.1.7.8. Examples of the Effect of Collation

What you would have to do is either switch to a collation that distinguishes between umlaut and base character (e.g. utf8_general_ci or utf8_general_bin) or switch to a different collation only when doing the comparison:

select * from users where username like 'Björn' COLLATE utf8_general_ci;

this is obviously slower, because the collation conversion has to be done for each record during the query.


The database, table and column are set to utf8_unicode_ci

This collation deliberately makes many accented characters match their unaccented bases. Often that's what you want, sometimes it isn't. When it isn't, you have to pick a collation that best fits the rules of the language you're working with. Here's some tables of MySQL collations you can search to try to find something suitable.

utf8_swedish_ci is a commonly-used possibility that doesn't collate A/Å/Ä and O/Ö together, though other accented characters are still collated together. Unfortunately there isn't a good collation I can see that knows about a wide range of European accented letters but still treats them as different. If you don't need case-insensitive matching at all, you can of course use utf8_bin.


Don't use a LIKE query when what you mean is =.

$query="select * from users where username = 'Björn'";
0

精彩评论

暂无评论...
验证码 换一张
取 消