
sort() for Japanese

Developer https://www.devze.com 2023-02-27 04:27 Source: Internet

I have set my current locale to Japanese. How can I make Japanese characters always sort ahead of non-Japanese characters? For example, right now English characters always appear before the Katakana characters. How can I reverse this?

Sorry for not being very clear. As you can see here, the final results have Java, NVIDIA and Windows ファイアウォール ranked as the first three, ahead of the Japanese entries. Is it possible to have those at the end?


Use usort() instead of sort() so that you can define your own comparison criteria.

Try this simple method. I tried it with the example from here, and it works.

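  // Sorts strings that start with a non-ASCII byte (e.g. Japanese) before ASCII ones.
  function mccompare($a, $b) {
    // A first byte >= 0x80 means the string starts with a multi-byte UTF-8 character.
    $fca = ord(substr($a, 0, 1)) >= 0x80;
    $fcb = ord(substr($b, 0, 1)) >= 0x80;
    if ($fca !== $fcb)
      return $fca ? -1 : 1;   // different groups: non-ASCII goes first
    return strcmp($a, $b);    // same group: byte-wise order, 0 when equal
  }

  usort($your_array, "mccompare");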

So for this example

  setlocale(LC_COLLATE, "jpn");

  $your_array = array ("システム", "画面", "Windows ファイウォール",
      "インターネット オプション",  "キーボード", "メール", "音声認識", "管理ツール",
      "自動更新", "日付と時刻", "タスク", "プログラムの追加と削除", "フォント",
      "電源オプション", "マウス", "地域と言語オプション", "電話とモデムのオプション",
      "Java", "NVIDIA");

  usort ($your_array, "mccompare");
  print_r($your_array);

it returns an array like this:

Array
(
    [0] => インターネット オプション
    [1] => キーボード
    [2] => システム
    [3] => タスク
    [4] => フォント
    [5] => プログラムの追加と削除
    [6] => マウス
    [7] => メール
    [8] => 地域と言語オプション
    [9] => 日付と時刻
    [10] => 画面
    [11] => 管理ツール
    [12] => 自動更新
    [13] => 電源オプション
    [14] => 電話とモデムのオプション
    [15] => 音声認識
    [16] => Java
    [17] => NVIDIA
    [18] => Windows ファイウォール
)

Note: This is just a quick solution to the problem, not a perfect one. It only checks the first byte of each string, so any string that starts with a non-ASCII character (not just Japanese) lands in the first group. With some extra effort you can improve the function to check whole multi-byte characters against Unicode and then decide whether $a <= $b or $a > $b.
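
One way the improvement mentioned above might look is sketched here. It assumes UTF-8 input and a PCRE build with Unicode property support; the names starts_with_japanese and japanese_first are made up for illustration and are not part of the original answer. Instead of testing the first byte, it checks whether the first character belongs to the Hiragana, Katakana or Han scripts, so accented Latin strings such as "Éclair" are no longer grouped with the Japanese ones.

```php
<?php
// Hypothetical helper: true if the first character is Hiragana, Katakana or Han.
// Assumes $s is valid UTF-8 (required by the /u modifier).
function starts_with_japanese(string $s): bool {
    return preg_match('/^[\p{Hiragana}\p{Katakana}\p{Han}]/u', $s) === 1;
}

// Comparator for usort(): Japanese-leading strings first,
// plain byte-wise order inside each group.
function japanese_first(string $a, string $b): int {
    $ja = starts_with_japanese($a);
    $jb = starts_with_japanese($b);
    if ($ja !== $jb) {
        return $ja ? -1 : 1;
    }
    return strcmp($a, $b);
}

$items = ["Java", "システム", "Éclair", "画面", "NVIDIA"];
usort($items, 'japanese_first');
print_r($items);
// Order: システム, 画面, Java, NVIDIA, Éclair
```

The ordering inside each group is still byte-wise, so it is consistent across platforms but not a true Japanese dictionary order.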

Hope it works for you!


Ultimately, PHP's sort() leaves the comparison to the underlying libc, and as shown in the article and my comment, not all libcs sort the same way. If you need consistent collation, use something like Collator (from PHP's intl extension), which relies on a third-party library (ICU) rather than on libc.
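
As a rough sketch of how Collator could be combined with the "Japanese first" requirement from the question: the snippet below assumes the intl extension is loaded and that 'ja_JP' is the locale you want; the $isJapanese helper is hypothetical. Collator on its own does not promise to put Japanese ahead of Latin text, so the grouping is done by hand and Collator is only used to order strings within each group.

```php
<?php
// Assumption: the intl extension is loaded and 'ja_JP' is the desired locale.
$collator = new Collator('ja_JP');

// Hypothetical helper: does the string start with Hiragana, Katakana or Han?
$isJapanese = function (string $s): bool {
    return preg_match('/^[\p{Hiragana}\p{Katakana}\p{Han}]/u', $s) === 1;
};

$items = ["システム", "Java", "画面", "キーボード", "NVIDIA"];

usort($items, function (string $a, string $b) use ($collator, $isJapanese): int {
    $ja = $isJapanese($a);
    $jb = $isJapanese($b);
    if ($ja !== $jb) {
        return $ja ? -1 : 1;                  // Japanese-leading strings first
    }
    return (int) $collator->compare($a, $b);  // ICU ordering inside each group
});

print_r($items);
```

Because the within-group order comes from ICU rather than libc, it should be the same on every platform that has the same ICU data.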
