PHP urlize function

Developer https://www.devze.com 2022-12-31 21:16 Source: Internet
I'm using this function on my website to transform user input into an acceptable URL:

function urlize($url) {
    $search = array('/[^a-z0-9]/', '/--+/', '/^-+/', '/-+$/');
    $replace = array('-', '-', '', '');
    return preg_replace($search, $replace, utf2ascii($url));
}

function utf2ascii($string) {
    $iso88591  = "\\xE0\\xE1\\xE2\\xE3\\xE4\\xE5\\xE6\\xE7";
    $iso88591 .= "\\xE8\\xE9\\xEA\\xEB\\xEC\\xED\\xEE\\xEF";
    $iso88591 .= "\\xF0\\xF1\\xF2\\xF3\\xF4\\xF5\\xF6\\xF7";
    $iso88591 .= "\\xF8\\xF9\\xFA\\xFB\\xFC\\xFD\\xFE\\xFF";
    $ascii = "aaaaaaaceeeeiiiidnooooooouuuuyyy";
    return strtr(mb_strtolower(utf8_decode($string), 'ISO-8859-1'), $iso88591, $ascii);
}

I'm having a problem with it though, with numbers. For some reason when I try:

echo urlize("test 23342");

I get "test-eiioe". Why is that and how can I fix it?

Thank you very much!


The problem is in your utf2ascii function. I suggest using the iconv() function instead.

iconv("UTF-8", "ISO-8859-1//IGNORE", $string);

The //IGNORE suffix on the output encoding tells iconv to silently drop any character it can't convert. The bad news is that those characters are lost. To have iconv approximate them instead, you can use //TRANSLIT.

Then, you can use strtolower and some regexp to eliminate non-alphanumeric characters (or to replace them with -).
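The steps described so far can be sketched roughly as follows. This is only an illustration: urlize_iconv is a hypothetical name, and targeting ASCII instead of ISO-8859-1 is my assumption, chosen so that //TRANSLIT has a chance to turn accented letters into plain ones. What iconv produces for non-ASCII input depends on the iconv implementation and the current locale, so only the ASCII case is shown with its output.

```php
<?php
// Hypothetical helper: iconv, then strtolower, then the question's regexes.
function urlize_iconv(string $text): string
{
    // Assumption: ASCII as the target encoding. The result for accented
    // input varies with the iconv implementation and locale.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        $ascii = $text; // fall back to the raw input if conversion fails
    }
    $ascii = strtolower($ascii);

    // Same patterns as in the question: anything that is not a-z or 0-9
    // becomes "-", runs of "-" collapse, leading/trailing "-" are removed.
    $search  = array('/[^a-z0-9]/', '/--+/', '/^-+/', '/-+$/');
    $replace = array('-', '-', '', '');
    return preg_replace($search, $replace, $ascii);
}

echo urlize_iconv("test 23342"), "\n"; // test-23342
```

Note that strtolower runs before the regex, because the pattern only allows lowercase letters.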

If you just want to encode arbitrary data, there is also urlencode(), but it won't give you nice-looking links.
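For comparison, here is what urlencode() does with the same kind of input. Nothing is lost, but spaces and non-ASCII bytes end up as + and %XX sequences, which is why the result is not a pretty link. The sample strings are only illustrations.

```php
<?php
// urlencode() keeps all the data but percent-encodes anything unsafe.
echo urlencode("test 23342"), "\n";        // test+23342

// "\u{00E9}" is é encoded as UTF-8 (two bytes: C3 A9).
echo urlencode("caf\u{00E9} & bar"), "\n"; // caf%C3%A9+%26+bar
```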


Hey, it looks like you are trying to create a slug. If so, this is the function I use/suggest:

function slug( $string ) {
    return strtolower( preg_replace( array( '/[^-a-zA-Z0-9\s]/', '/[\s]/' ), array( '', '-' ), $string ) );
}
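A few sample calls, as a quick sketch of how this behaves. The function is repeated so the snippet runs on its own, and the inputs are made up. Two things to be aware of: repeated whitespace produces repeated hyphens, and leading or trailing spaces become leading or trailing hyphens.

```php
<?php
// slug() copied from the answer so this example is self-contained.
function slug($string) {
    return strtolower(preg_replace(array('/[^-a-zA-Z0-9\s]/', '/[\s]/'), array('', '-'), $string));
}

echo slug("Test 23342"), "\n";     // test-23342
echo slug("Hello,  World!"), "\n"; // hello--world  (two spaces give two hyphens)
echo slug(" PHP urlize "), "\n";   // -php-urlize-  (outer spaces are kept as hyphens)
```

If that matters for your URLs, you could add the '/--+/', '/^-+/' and '/-+$/' patterns from the question.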


What's wrong with urlencode()?


Your utf2ascii function is wrong; that's what turns test 23342 into test eiioe.

Why don't you use iconv to do the conversion from UTF-8 to ISO-8859-1? I.e., use iconv("UTF-8", "ISO-8859-1//TRANSLIT", $url);
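To see where eiioe comes from, here is a small sketch. It assumes the doubled backslashes shown in the question are really in the source file. In that case "\\xE0" is four literal characters (backslash, x, E, 0) rather than one ISO-8859-1 byte, so strtr ends up mapping the digits 0 to 7 to letters. Only the first line of the map is used here, because strtr ignores the extra characters in the longer argument and the map is already 32 characters long. The variable names are mine.

```php
<?php
// First line of the question's map: "\\x" is a literal backslash plus "x".
$from = "\\xE0\\xE1\\xE2\\xE3\\xE4\\xE5\\xE6\\xE7";
$to   = "aaaaaaaceeeeiiiidnooooooouuuuyyy";

echo strlen($from), "\n";                   // 32 (characters, not 8 bytes)
// Position by position, "2" maps to "e", "3" to "i" and "4" to "o".
echo strtr("test 23342", $from, $to), "\n"; // test eiioe

// With single backslashes each escape is one byte, so digits are untouched.
$fixed = "\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7";
echo strlen($fixed), "\n";                   // 8
echo strtr("test 23342", $fixed, $to), "\n"; // test 23342
```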


I added accented-character replacement to Maxime Michel's answer:

function urlize($url) {
    $search = array('/[^a-z0-9]/', '/--+/', '/^-+/', '/-+$/' );
    $replace = array( '-', '-', '', '');
    $unwanted_array = array(    'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
                    'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
                    'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
                    'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
                    'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );
    $url = strtr( $url, $unwanted_array );
    $url = strtolower(iconv("UTF-8", "ISO-8859-1//TRANSLIT", $url));
    return preg_replace($search, $replace, $url);
}