I wrote a clean function in PHP for a project to help build readable URLs from database content. It removes spaces and special characters, so that a sentence like "My Motörhead Albums" becomes my-motoerhead-albums in the URL. However, it does not seem to convert umlauts such as ö, ä and ü correctly, and I can't figure out why.
Here's the code:
function clean($text) {
    $text = trim($text);
    $text = strtolower($text);
    $code_entities_match = array(
        ' ', '--', '"', '!', '@', '#', '$', '%', '^', '&',
        '*', '(', ')', '_', '+', '{', '}', '|', ':', '"',
        '<', '>', '?', '[', ']', '\\', ';', "'", ',', '.',
        '/', '*', '+', '~', '`', '=', '¡', '¿', '´', '%C2%B4',
        'ä', 'ö', 'ü', 'ß', 'å', 'á', 'à',
        'ó', 'ò', 'ú', 'ù', 'í', 'é', 'è', 'ø', 'Þ', 'ð', '%C3%9E', 'þ'
    );
    $code_entities_replace = array(
        '', '-', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        'ae', 'oe', 'ue', 'ss', 'aa', 'a', 'a', 'o', 'o', 'u', 'u', 'i', 'e', 'e', 'oe', 'th', 'th', 'th', 'th'
    );
    $text = str_replace($code_entities_match, $code_entities_replace, $text);
    return $text;
}
This is the function I use to build URL-safe strings:
static public function slugify($text)
{
    $text = str_replace(" ", "_", $text);
    // replace non letter or digits by -
    $text = preg_replace('~[^\\pL\d_]+~u', '-', $text);
    // trim
    $text = trim($text, '-');
    // transliterate
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
    // lowercase
    $text = strtolower($text);
    // remove unwanted characters
    $text = preg_replace('~[^-\w]+~', '', $text);
    if (empty($text))
    {
        return 'n-a';
    }
    return $text;
}
It was taken from Symfony's Jobeet tutorial.
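As a complement to the function above, here is a minimal sketch of a variant that spells out the German umlaut replacements instead of relying on iconv's //TRANSLIT, whose output can vary between platforms and locales. The name slugify_with_map and the four-entry table are assumptions made for illustration, not part of the Jobeet code. It also assumes UTF-8 input, PHP 7 or later, and the mbstring extension.

```php
<?php
// Hypothetical helper, not from the original post: maps a few umlauts
// explicitly, then reduces everything else to ASCII letters, digits and "-".
function slugify_with_map(string $text): string
{
    // Lowercase in a multibyte-aware way (requires the mbstring extension).
    $text = mb_strtolower(trim($text), 'UTF-8');

    // Explicit transliteration table; "\u{...}" escapes (PHP 7+) keep the
    // table independent of the encoding the source file is saved in.
    $map = [
        "\u{00E4}" => 'ae', // a-umlaut
        "\u{00F6}" => 'oe', // o-umlaut
        "\u{00FC}" => 'ue', // u-umlaut
        "\u{00DF}" => 'ss', // sharp s
    ];
    $text = strtr($text, $map);

    // Replace every run of bytes other than a-z and 0-9 with one hyphen.
    $text = preg_replace('~[^a-z0-9]+~', '-', $text);

    // Drop leading/trailing hyphens; fall back to 'n-a' like the answer does.
    $text = trim($text, '-');
    return $text === '' ? 'n-a' : $text;
}

echo slugify_with_map("My Mot\u{00F6}rhead Albums"), "\n"; // my-motoerhead-albums
```

Characters outside the table (for example é) are simply replaced by a hyphen in this sketch, so the table would need extending for other languages.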