
Does a Zend Framework application need mbstring for UTF-8 support?

Developer https://www.devze.com 2023-01-12 04:38 Source: web
I'm building a web app in Zend Framework that needs UTF-8 support for all languages. This seems to work fine, except for functions like stripslashes and the like.

This page talks about using mbstring: http://developer.loftdigital.com/blog/php-utf-8-cheatsheet

Is it necessary to use mbstring on my server and replace ALL occurrences of UTF-8-incapable functions with their mb_ variants?

Isn't Zend Framework supposed to support UTF-8? If not, we'd have to replace all such functions in the ZF codebase with their mb_ alternatives, right? That is an impossible task, because an upgrade to a new ZF version would break our code.

mail()      -> mb_send_mail()
strlen()    -> mb_strlen()  
strpos()    -> mb_strpos()
strrpos()   -> mb_strrpos()
substr()    -> mb_substr()
strtolower()    -> mb_strtolower()
strtoupper()    -> mb_strtoupper()
substr_count()  -> mb_substr_count()
ereg()      -> mb_ereg()
eregi()     -> mb_eregi()
ereg_replace()  -> mb_ereg_replace()
eregi_replace() -> mb_eregi_replace()   
split()     -> mb_split()

What's your advice on this? I might be completely wrong here. I read about using:

mbstring.func_overload  = 7 ;

to overload all functions automatically.

Will this break an existing application that doesn't need UTF-8, or does it "degrade gracefully"?


Do not, and I can only repeat, do not use mbstring overloading. It will most certainly break any method that, for instance, relies on strlen() returning the number of bytes. All components in Zend Framework expect UTF-8 by default, but can handle other charsets if you tell them to. That is done via iconv_*, which is built into PHP by default, so there is no dependency on extra extensions like mbstring.
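To make the byte-count point concrete, here is a minimal sketch of what strlen() and mb_strlen() return for the same UTF-8 string. The sample string and variable names are only illustrations. With mbstring.func_overload set to 7, strlen() would behave like mb_strlen(), so code that computes a byte length (for a Content-Length header, for instance) would get the character count instead.

```php
<?php
// "héllo" encoded as UTF-8: the "é" takes two bytes (0xC3 0xA9).
$text = "h\xC3\xA9llo";

// Without overloading, strlen() counts bytes.
$byteCount = strlen($text);

// mb_strlen() counts characters in the given encoding.
$charCount = mb_strlen($text, 'UTF-8');

// A Content-Length header must carry the byte count, not the character count.
$header = 'Content-Length: ' . $byteCount;

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo $header, "\n";
```

Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so this only matters on older PHP versions.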

The only place where you have to tell Zend Framework about UTF-8 is your database connection, which you can do via the charset option (see the Zend_Db or Zend_Application documentation). You will also want to tell the user agent which charset you deliver via the Content-Type header. And don't forget to add accept-charset="utf-8" to your <form> tags.
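As a rough illustration of the charset option, a Zend_Application configuration might look like the fragment below. The adapter choice, host, credentials and database name are placeholders, not values from the question.

```ini
; application.ini (Zend_Application); all connection values are placeholders
resources.db.adapter = "PDO_MYSQL"
resources.db.params.host = "localhost"
resources.db.params.username = "webuser"
resources.db.params.password = "secret"
resources.db.params.dbname = "app"
; tells the adapter to use UTF-8 for the connection
resources.db.params.charset = "utf8"
```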


I don't think overloading all the functions with mbstring would be a good idea. We all know that PHP doesn't handle UTF-8 natively, so we use something like "SET NAMES utf8" for the database, and with Zend_Mail we pass the encoding as a parameter and let Zend_Mail manage it internally.

Another example is Zend_Validate_StringLength: it has a parameter called encoding, and it uses iconv in its setEncoding() method:

    public function setEncoding($encoding = null)
    {
        if ($encoding !== null) {
            // Remember the current internal encoding so it can be restored.
            $orig   = iconv_get_encoding('internal_encoding');
            // Try the requested encoding to check that iconv supports it.
            $result = iconv_set_encoding('internal_encoding', $encoding);
            if (!$result) {
                require_once 'Zend/Validate/Exception.php';
                throw new Zend_Validate_Exception('Given encoding not supported on this OS!');
            }

            // Restore the original internal encoding.
            iconv_set_encoding('internal_encoding', $orig);
        }

        $this->_encoding = $encoding;
        return $this;
    }
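The idea behind such an encoding parameter can be sketched without the framework. The helper below is hypothetical (lengthBetween is not a Zend Framework function) and only assumes that the length is measured with iconv_strlen() and an explicit encoding rather than with strlen().

```php
<?php
// Hypothetical helper, not part of Zend Framework: checks that a string's
// character count (not byte count) lies within [$min, $max].
function lengthBetween($value, $min, $max, $encoding = 'UTF-8')
{
    // iconv_strlen() returns false if $value is not valid in $encoding.
    $length = iconv_strlen($value, $encoding);
    if ($length === false) {
        return false;
    }
    return $length >= $min && $length <= $max;
}

// "Zürich" in UTF-8: 6 characters, 7 bytes.
$name = "Z\xC3\xBCrich";

var_dump(lengthBetween($name, 1, 6)); // character-based check
var_dump(strlen($name) <= 6);         // byte-based check, for comparison
```

The character-based check accepts the name as six characters long, while the byte-based comparison rejects it because the string occupies seven bytes.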

But you will still end up using mbstring in your app for logic that is not related to the framework.

For example, yesterday I was sorting a UTF-8 array of posts and comments from a database, and I couldn't get the job done without mbstring, because PHP doesn't handle UTF-8 natively :(

I love mbstring; it made my life easier.

EDIT: What I meant to say is: use mbstring whenever you need it and let the framework manage itself. I don't like overloading all functions automatically.
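Using mbstring only where you need it might look like the sketch below: the mb_ functions are called explicitly with an encoding argument, and the plain byte-oriented functions keep their normal behaviour everywhere else. The sample string and variable names are made up for illustration.

```php
<?php
// "été indien" encoded as UTF-8; each "é" takes two bytes.
$title = "\xC3\xA9t\xC3\xA9 indien";

// Byte-oriented cut: 4 bytes ends in the middle of the second "é".
$byteCut = substr($title, 0, 4);

// Character-oriented cut: the first three characters, "été".
$charCut = mb_substr($title, 0, 3, 'UTF-8');

// Case conversion that knows about non-ASCII letters.
$upper = mb_strtoupper($title, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // is the byte cut still valid UTF-8?
echo $charCut, "\n";
echo $upper, "\n";
```

Because the encoding is passed on every call, nothing here depends on mbstring.func_overload or on the global internal encoding.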


Isn't Zend Framework supposed to support UTF-8?

I don't know. You can grep through the code for strlen, for example, but you will still need to read the code to determine whether it's used in a context that is not multibyte-safe. A quick Google search turned up http://www.iezzi.ch/archives/371, so it seems that ZF is prepared for UTF-8 apps.

What's your advice on this? I might be completely wrong here. I read about using mbstring.func_overload = 7. Will this break an existing application that doesn't need UTF-8, or does it "degrade gracefully"?

Of course it will work for non-multibyte strings as well and will not break them. But before using it, I would suggest making sure that you really need it, because it will cost performance.
