Why does my PHP substr() show obscure characters when cutting a text?

Developer https://www.devze.com 2022-12-17 03:11 Source: web
I'm using the substr() function to limit the number of characters in strings, but sometimes the output text contains obscure characters, question marks, etc.

The text being cut with substr() is already UTF-8 encoded, and it does NOT contain HTML entities that could cause a problem like this.

Thanks


Because you are cutting characters in half.

Use mb_substr for multibyte character encodings like UTF-8. substr just counts bytes while mb_substr counts characters.
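
A minimal sketch of that difference, assuming the mbstring extension is loaded. The sample string is made up, and its accented letter is written as explicit UTF-8 bytes so the example does not depend on how the file is saved:

```php
<?php
// Hypothetical sample text: "Crème", with 'è' spelled out as its two UTF-8 bytes.
$text = "Cr\xC3\xA8me";

// substr() counts bytes: 3 bytes = 'C', 'r' and only the first half of 'è'.
$bytes = substr($text, 0, 3);

// mb_substr() counts characters: 3 characters = 'C', 'r', 'è' (4 bytes).
$chars = mb_substr($text, 0, 3, 'UTF-8');

// The byte cut is no longer valid UTF-8, which is what shows up as '?' or odd symbols.
var_dump(mb_check_encoding($bytes, 'UTF-8'));
var_dump(mb_check_encoding($chars, 'UTF-8'));
```

Passing the encoding explicitly avoids relying on whatever the internal encoding happens to be on the server.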


The reason is that you are using UTF-8, which is a multibyte encoding, and substr() works on single bytes only! htmlentities() doesn't matter here.

You SHOULD use mb_substr() (http://php.net/manual/en/function.mb-substr.php) and the other multibyte functions.
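
As a rough illustration of using the multibyte functions together, here is a small helper for limiting a string to a number of characters. The name limit_chars, its parameters, and the sample string are my own inventions, not part of PHP. It assumes the mbstring extension and UTF-8 input:

```php
<?php
// Hypothetical helper (not a PHP built-in): keep at most $limit characters
// and append $suffix only when something was actually cut off.
function limit_chars(string $text, int $limit, string $suffix = '...'): string
{
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    return mb_substr($text, 0, $limit, 'UTF-8') . $suffix;
}

// "naïve café", with the accented letters written as UTF-8 byte escapes.
$text = "na\xC3\xAFve caf\xC3\xA9";

echo strlen($text), "\n";             // length in bytes
echo mb_strlen($text, 'UTF-8'), "\n"; // length in characters
echo limit_chars($text, 3), "\n";     // cut after the third character
```

strlen() and mb_strlen() disagree here for the same reason substr() and mb_substr() do: the first counts bytes, the second counts characters.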


Just to extend Gurmbo's answer: using mb_substr will solve your problem, but if an HTML entity falls at the end when you trim, part of that entity is still left in the output. When I did some research, I found that WordPress has a function, wp_html_excerpt, that solves this problem.

wp_html_excerpt removes such a partial entity from the end of the string.

Here is the source code from WordPress.

/**
 * Safely extracts not more than the first $count characters from html string.
 *
 * UTF-8, tags and entities safe prefix extraction. Entities inside will *NOT*
 * be counted as one character. For example &amp; will be counted as 4, &lt; as
 * 3, etc.
 *
 * @since 2.5.0
 *
 * @param string $str   String to get the excerpt from.
 * @param int    $count Maximum number of characters to take.
 * @param string $more  Optional. What to append if $str needs to be trimmed. Defaults to empty string.
 * @return string The excerpt.
 */
function wp_html_excerpt( $str, $count, $more = null ) {
    if ( null === $more )
        $more = '';
    $str = wp_strip_all_tags( $str, true );
    $excerpt = mb_substr( $str, 0, $count );
    // remove part of an entity at the end
    $excerpt = preg_replace( '/&[^;\s]{0,6}$/', '', $excerpt );
    if ( $str != $excerpt )
        $excerpt = trim( $excerpt ) . $more;
    return $excerpt;
}
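
The function above depends on WordPress (wp_strip_all_tags). Outside WordPress, the same steps can be sketched roughly like this. html_excerpt_sketch is a made-up name, and PHP's strip_tags() is only a stand-in here, since it does not do everything wp_strip_all_tags does. mbstring is assumed:

```php
<?php
// Hypothetical stand-alone variant of the idea; not WordPress code.
// strip_tags() is used as a simplified substitute for wp_strip_all_tags().
function html_excerpt_sketch(string $html, int $count, string $more = ''): string
{
    $text = trim(strip_tags($html));
    $excerpt = mb_substr($text, 0, $count, 'UTF-8');

    // Remove part of an entity left at the end, e.g. "&a" from a cut "&amp;".
    $excerpt = preg_replace('/&[^;\s]{0,6}$/', '', $excerpt);

    if ($text !== $excerpt) {
        $excerpt = trim($excerpt) . $more;
    }
    return $excerpt;
}

// Cutting after 7 characters lands inside "&amp;", so the fragment is dropped.
echo html_excerpt_sketch('<p>Fish &amp; chips</p>', 7, '...'), "\n";
```

A complete entity at the end is left alone, because the pattern only matches an "&" that is not followed by a ";" before the end of the string.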


If you have encoding problems, you can also apply the html_entity_decode() function, which converts all HTML entities to their applicable characters. For example:

echo substr(html_entity_decode($string_to_cut), 0, 28) . "...";

That should also work.
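
One caveat: the decoded string can itself contain multibyte characters, so a byte-based cut can still split one. A variant that decodes first and then cuts by characters might look like the sketch below. The sample input is made up, and mbstring and UTF-8 output are assumed:

```php
<?php
// Hypothetical input containing HTML entities for accented letters.
$string_to_cut = 'Caf&eacute; cr&egrave;me br&ucirc;l&eacute;e';

// Decode entities into real UTF-8 characters first.
$decoded = html_entity_decode($string_to_cut, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Then cut by characters, not bytes.
$short = mb_substr($decoded, 0, 4, 'UTF-8') . '...';
echo $short, "\n";

// For comparison: the same cut with substr() ends in half of the 'é'.
var_dump(mb_check_encoding(substr($decoded, 0, 4), 'UTF-8'));
```

If the result is printed into an HTML page, it would normally be passed through htmlspecialchars() again after cutting.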
