I'd like to take user input, 开发者_开发技巧denoted as $dangerous_string, and use it as part of a RegEx in a MySQL query.
What's the best way to go about doing this? I want to use the user's string as a literal -- if it contains any characters that mean something in MySQL RegEx, those characters should not actually affect my Regular Expression.
$dangerous_string = $_GET["string"];
//do something here
$dangerous_string = what_goes_here($dangerous_string);
$sql = "SELECT * FROM table WHERE search_column REGEX '" . $mysqli->real_escape_string("[[:<:]]$dangerous_string") . "'";
//etc....
Use preg_quote
and replace &
manually
preg_quote
takes a string and escapes special characters with a backslash. It is meant for PHP regexes, not MySQL regexes, and it does not escape &
, which is needed for MySQL. So we only need to modify it like so:
function escape_regex_for_mysql($dangerous_string) {
return preg_replace('/&/', '\\&', preg_quote($dangerous_string));
}
Note that you should still use prepared statements (or $mysqli->real_escape_string
) on top, like this:
$query = $wpdb->prepare(
'SELECT * FROM table WHERE search_column REGEXP %s',
'[[:<:]]' . escape_regex_for_mysql($dangerous_string)
);
Rationale:
Let's look at MySQL's documentation to see which characters will need to be escaped. The documentation says:
MySQL >= 8.0.4 implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe.
So let's look at the documentation for the ICU project:
\ (outside of sets) Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ .
\ (inside sets) Quotes the following character. Characters that must be quoted to be treated as literals are [ ] \ Characters that may need to be quoted, depending on the context are - &
So the list of special characters that need to be escaped are * ? + [ ] ( ) { } ^ $ \ . - &
. preg_quote
escapes all of these except &
. It also escapes some characters unnecessarily, but MySQL will handle that as expected.
Alternatively, you can add &
as a second parameter to preg_quote
for it to be escaped, e.g. preg_quote('AT&T', '&')
AFAIK, there is no native way of escaping for MySQL regex. You can do it in PHP with preg_quote (http://www.php.net/manual/en/function.preg-quote.php) which would probably do the job for you, but is obviously not designed for the purpose.
My preferred way if I were in your situation would be to construct a regex whitelist in PHP that you can then apply to your dangerous string:
$safeString = preg_replace('/[^\w]/','',$dangerousString);
This removes any non-word characters (i.e. anything except A-Za-z0-9_) from your string.
NB I believe the other answers given will not remove/escape regex special characters, which I believe is your requirement.
You need to ensure that quotes and ticks are properly handled before passing to the database. The best method for this is:
mysql_real_escape_string ([php doc][1])
This method is available in both PHP, and C++ mysql client libraries.
This should ensure any 'dangerous_string' is no longer dangerous and can be used within a quoted string used by RegEx.
精彩评论