We're letting users search a database from a single text input, and I'm having trouble filtering some user-supplied strings.
For example, if the user submits:
��������� lcd SONY
(Note the ?'s) I need to cancel the search.
I've included a base64-encoded version of the above string, wrapped up so that it's easy to run:
print(base64_decode("1MfLxc/RwdPHIGxjZCBTT05Z"));
I've ignored such inputs before, but I've just realised (I'm not sure why it started) that the MySQL query takes nearly forever to execute for them, so this is now a high priority.
Another example to highlight that we are using utf-8 and mb_detect_encoding is not helping much:
print(base64_decode("zqDOm8+Fzr3PhM63z4HOuc6/IM+Bzr/Phc+HzyU="));
ΠΛυντηριο ρουχ�%
So:
- how can I detect/filter these inputs?
- how is this input being generated?
You shouldn't be getting that. If you really want to filter (which I don't recommend), check that the input is alphanumeric, while also allowing characters such as "-.;", etc.
You can use some of these functions to help you in the filtering process.
http://www.php.net/manual/en/function.ctype-alnum.php
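As a complement to the ctype functions, here is a minimal sketch of rejecting input that is not well-formed UTF-8 before it reaches the query. It assumes the mbstring extension is installed; the helper name is_valid_search_input is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): true only if $input is well-formed UTF-8.
function is_valid_search_input($input)
{
    // mb_check_encoding() checks that the byte sequence is valid for the given encoding.
    return mb_check_encoding($input, 'UTF-8');
}

// The two sample strings from the question, decoded from base64.
$first  = base64_decode("1MfLxc/RwdPHIGxjZCBTT05Z");                 // non-UTF-8 bytes followed by " lcd SONY"
$second = base64_decode("zqDOm8+Fzr3PhM63z4HOuc6/IM+Bzr/Phc+HzyU="); // Greek text ending in a cut-off sequence

foreach (array($first, $second, "lcd SONY") as $query) {
    if (!is_valid_search_input($query)) {
        // Cancel the search instead of sending the string to MySQL.
        echo "rejected\n";
        continue;
    }
    echo "ok: " . $query . "\n";
}
```

Both samples from the question fail this check, while plain "lcd SONY" passes. As for where such input comes from: read as ISO-8859-7, the leading bytes of the first sample spell the Greek word ΤΗΛΕΟΡΑΣΗ, which suggests (but does not prove) that some client is submitting the form in a legacy Greek encoding rather than UTF-8.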
If you execute these queries after creating the connection to mysql, it should handle utf-8 input and results just fine without spitting out ?'s.
mysql_query("SET character_set_client=utf8", $mysqlConn);
mysql_query("SET character_set_connection=utf8", $mysqlConn);
mysql_query("SET character_set_results=utf8", $mysqlConn);
(assuming the database is set to utf-8 and you don't mind not filtering them if they don't turn into ?'s)
(also assuming you are using mysql, other dbms probably have similar functions)