php - htmlspecialchars with unicode

开发者 https://www.devze.com 2023-02-21 17:55 Source: Internet
    $string = "Główny folder grafik<p>asd nc</p>";

echo htmlspecialchars($string);

On the live site the output is:

G&#322;ówny folder grafik<p>asd nc</p>

On my local machine it is:

Główny folder grafik<p>asd nc</p>

What is the problem? I want the live site to produce the same result as the local one.


htmlspecialchars() accepts additional parameters -- the third one being the charset.

Try specifying that third parameter.


You need to add extra parameters to the htmlspecialchars() function. The following should work:

htmlspecialchars($string, ENT_QUOTES, "UTF-8");
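
As a minimal sketch, here is that call applied to the string from the question. It assumes the source file and the page are both served as UTF-8; the variable name `$escaped` is only illustrative.

```php
<?php
// The string from the question; this file is assumed to be saved as UTF-8.
$string = "Główny folder grafik<p>asd nc</p>";

// Passing the charset explicitly makes the result independent of the
// server's PHP version and its default_charset setting.
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// Only the markup characters are escaped; "ł" and "ó" stay as they are:
// Główny folder grafik&lt;p&gt;asd nc&lt;/p&gt;
```

If the live and local servers still disagree after this change, comparing their PHP versions and `default_charset` values is a reasonable next step.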


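You may want to pass the optional charset parameter to htmlspecialchars(). Before PHP 5.4 it defaulted to ISO-8859-1; PHP 5.4 changed the default to UTF-8, and since PHP 5.6 the default is taken from the default_charset configuration option.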


If you require all characters that have associated named entities to be translated, use htmlentities() instead. That function is identical to htmlspecialchars() in all ways, except that with htmlentities() all characters which have HTML character entity equivalents are translated into these entities.

But even htmlentities() does not encode all Unicode characters. It encodes what it can [all of latin1], and the others slip through (e.g. Љ).
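
The difference can be seen with a short sketch. The sample string is made up for illustration, and the calls pass only `ENT_QUOTES`, so PHP uses its HTML 4.01 entity table; other doctype flags may give different results.

```php
<?php
// Illustrative input mixing Latin-1 ("ó"), Latin Extended-A ("ł"),
// Cyrillic ("Љ") and markup. This file is assumed to be saved as UTF-8.
$string = "Główny Љ <p>";

$special  = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
$entities = htmlentities($string, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // only & " ' < > are escaped
echo $entities, "\n";  // "ó" becomes &oacute;, while "ł" and "Љ" pass through
```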

The following function checks each character against ASCII code ranges, so you can customise which characters are converted and which are left alone.

(Note: it is certainly not fast.)

/**
 * Unicode-proof htmlentities.
 * Returns 'normal' chars as chars and everything else as numeric HTML entities.
 * Requires the mbstring extension.
 * @param  string $str input string (UTF-8)
 * @return string      encoded output
 */
function superentities( $str ){
    // decode existing entities first, otherwise they would be double-escaped
    $str = html_entity_decode(stripslashes($str), ENT_QUOTES, 'UTF-8');
    // split into an array of single (possibly multi-byte) characters
    $ar = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $str2 = ''; // initialise the output to avoid an undefined-variable notice
    foreach ($ar as $c){
        $o = ord($c); // first byte only; multi-byte chars are caught by strlen()
        if ( (strlen($c) > 1) || /* multi-byte [unicode] */
            ($o < 32 || $o > 126) || /* control chars and DEL */
            ($o > 33 && $o < 40) || /* " # $ % & ' */
            ($o > 59 && $o < 63) /* < = > */
        ) {
            // convert to numeric entity (map covers U+0000 to U+10FFFF)
            $c = mb_encode_numericentity($c, array(0x0, 0x10ffff, 0, 0x1fffff), 'UTF-8');
        }
        $str2 .= $c;
    }
    return $str2;
}
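
If the per-character loop turns out to be too slow, a shorter variant might look like the sketch below. It is not the function above: `numeric_entities` is a hypothetical name, it does not decode existing entities first, and it leaves ASCII control characters untouched. It assumes PHP 7 or later with the mbstring extension and UTF-8 input.

```php
<?php
// Hypothetical helper, not part of the original answer.
function numeric_entities(string $str): string
{
    // Escape the HTML-significant ASCII characters (& " ' < >) first.
    $escaped = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

    // Convert every code point from U+0080 to U+10FFFF to a decimal entity.
    // Offset 0 leaves the code point unchanged; the mask 0x1FFFFF keeps
    // all 21 bits of it.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

echo numeric_entities("Główny Љ <p>"), "\n";
// G&#322;&#243;wny &#1033; &lt;p&gt;
```

Because the second step only touches code points from U+0080 upward, the ampersands produced by htmlspecialchars() are not escaped a second time.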