preg_replace: wildcards do not match umlaut-characters

Source: https://www.devze.com, 2022-12-28 05:37 (reposted from the web)
I want to filter a string using the \w character class, but unfortunately it does not match umlauts.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\s]/","",$i);
echo $x; // "Die Hhe";

I could list all the characters explicitly in the preg_replace pattern, but that is not very elegant, since the list will become very long. At the moment I am only preparing this for German, but more languages will follow.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\säöüÄÖÜß]/","",$i);
echo $x; // "Die Höhe";

Is there a way to match all of them at once?


Your strings are evidently UTF-8, so you want the 'u' flag and Unicode properties instead of \w:

$x = preg_replace('/[^\p{L}\p{N} ]/u',"",$i);
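
To see how that pattern behaves on a few inputs, here is a minimal sketch. The helper name keepLettersDigits and the sample strings are made up for illustration, and the script is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical helper (not part of the original answer): strip everything
// except Unicode letters, Unicode numbers and plain spaces.
function keepLettersDigits(string $s): string
{
    // \p{L} = any letter, \p{N} = any number; the 'u' flag makes PCRE
    // treat pattern and subject as UTF-8 instead of single bytes.
    $result = preg_replace('/[^\p{L}\p{N} ]/u', '', $s);
    // preg_replace() returns null on error, e.g. for invalid UTF-8 input.
    return $result ?? '';
}

echo keepLettersDigits("Die Höhe!"), "\n";   // Die Höhe
echo keepLettersDigits("Łódź, 2022"), "\n";  // Łódź 2022
echo keepLettersDigits("tab\there"), "\n";   // tabhere
```

Note that the class only keeps the plain space, so tabs and newlines are removed as well; adding \s to the class would keep them.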


In my opinion, this should remove all non-meaningful characters:

$val = "Die Höhe";
$val = preg_replace('/[^\x20-\x7e\xa1-\xff]+/u', '', $val);
echo $val; // "Die Höhe"
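
One caveat that may be worth checking before using this for further languages, sketched here under the assumption that the script is saved as UTF-8: with the 'u' flag the range \xa1-\xff refers to code points U+00A1 to U+00FF, so letters outside that block are stripped. The sample strings are made up for illustration:

```php
<?php
// Same pattern as above; with the 'u' flag, \xa1-\xff means the code
// points U+00A1..U+00FF, not raw bytes.
$pattern = '/[^\x20-\x7e\xa1-\xff]+/u';

// Printable ASCII (including punctuation) and ö (U+00F6) survive,
// the control character "\n" inside the string is removed.
echo preg_replace($pattern, '', "Die Höhe!\n"), "\n"; // Die Höhe!

// Ł (U+0141) and ź (U+017A) lie outside the range and are dropped.
echo preg_replace($pattern, '', "Łódź"), "\n";        // ód
```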