So to support multiple languages on the site, I need to make sure all data going into the database is stored as UTF-8. My question is, is there a class out there that does form checks and data sanitization for forms using utf8 formatted form data? Right now I do checks such as whether a field is empty or whether the form data is a certain length, but I would have to use different functions because of UTF-8. So I just wanted to check whether there is a pre-existing class for this type of checking/sanitizing, as I try not to reinvent the wheel.
Some general information about the topic can be found in Building Scalable Web Sites by Cal Henderson (O'Reilly 2006) in chapter 5 (Chapter 5 as PDF).
My question is, is there a class out there that does form checks and data sanitization for forms using utf8 formatted form data?
This shouldn't be necessary in the first place. If your form (and everything else along the way) is properly encoded as UTF-8, there should be no hiccups.
If everything is properly set up, the only place where things can go wrong is when the user enters invalid characters into the form. It's impossible to reliably defend against that, but the risk of this happening is minimal.
If you have a real-world situation where invalid characters can make it into the data, you can do an iconv() with the //IGNORE option to weed out invalid characters:
$data = iconv("UTF-8", "UTF-8//IGNORE", $data);
In the same way, you can find out whether a string contains invalid characters by comparing the string lengths before and after the iconv() call: if the result is shorter, something was dropped.
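For what it's worth, here is a rough sketch of that length comparison, plus a character-based length check of the kind the question mentions. The helper names is_valid_utf8() and utf8_length_between() are made up for illustration; only iconv(), iconv_strlen() and strlen() are real PHP functions. It also assumes that on some setups iconv() returns false on invalid input instead of stripping it, so both outcomes are treated as "invalid".

```php
<?php
// Hypothetical helper: true when $data is well-formed UTF-8.
function is_valid_utf8(string $data): bool
{
    if ($data === '') {
        return true; // nothing to check
    }
    // @ hides the notice iconv() may raise on illegal input.
    $clean = @iconv("UTF-8", "UTF-8//IGNORE", $data);
    // Invalid input either comes back shorter or makes iconv() return false,
    // depending on the iconv implementation; both mean "not valid".
    return $clean !== false && strlen($clean) === strlen($data);
}

// Hypothetical helper: length check in characters rather than bytes.
// strlen() counts bytes, so "é" would count as 2; iconv_strlen() counts it as 1.
function utf8_length_between(string $data, int $min, int $max): bool
{
    $length = @iconv_strlen($data, "UTF-8");
    // iconv_strlen() returns false when the string is not valid UTF-8.
    return $length !== false && $length >= $min && $length <= $max;
}
```

You would run something like is_valid_utf8($_POST['name']) before the usual empty/length checks, and reject or clean the field when it returns false.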