开发者

What's the best way to escape user input for Regular Expressions in MySQL?

开发者 https://www.devze.com 2023-01-17 14:16 出处:网络
I\'d like to take user input, 开发者_开发技巧denoted as $dangerous_string, and use it as part of a RegEx in a MySQL query.

I'd like to take user input, 开发者_开发技巧denoted as $dangerous_string, and use it as part of a RegEx in a MySQL query.

What's the best way to go about doing this? I want to use the user's string as a literal -- if it contains any characters that mean something in MySQL RegEx, those characters should not actually affect my Regular Expression.

$dangerous_string = $_GET["string"];
//do something here
$dangerous_string = what_goes_here($dangerous_string);
$sql = "SELECT * FROM table WHERE search_column REGEX '" . $mysqli->real_escape_string("[[:<:]]$dangerous_string") . "'";

//etc....


Use preg_quote and replace & manually

preg_quote takes a string and escapes special characters with a backslash. It is meant for PHP regexes, not MySQL regexes, and it does not escape &, which is needed for MySQL. So we only need to modify it like so:

function escape_regex_for_mysql($dangerous_string) {
    return preg_replace('/&/', '\\&', preg_quote($dangerous_string));
}

Note that you should still use prepared statements (or $mysqli->real_escape_string) on top, like this:

$query = $wpdb->prepare(
    'SELECT * FROM table WHERE search_column REGEXP %s',
    '[[:<:]]' . escape_regex_for_mysql($dangerous_string)
);

Rationale:

Let's look at MySQL's documentation to see which characters will need to be escaped. The documentation says:

MySQL >= 8.0.4 implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe.

So let's look at the documentation for the ICU project:

\ (outside of sets) Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ .

\ (inside sets) Quotes the following character. Characters that must be quoted to be treated as literals are [ ] \ Characters that may need to be quoted, depending on the context are - &

So the list of special characters that need to be escaped are * ? + [ ] ( ) { } ^ $ \ . - &. preg_quote escapes all of these except &. It also escapes some characters unnecessarily, but MySQL will handle that as expected.

Alternatively, you can add & as a second parameter to preg_quote for it to be escaped, e.g. preg_quote('AT&T', '&')


AFAIK, there is no native way of escaping for MySQL regex. You can do it in PHP with preg_quote (http://www.php.net/manual/en/function.preg-quote.php) which would probably do the job for you, but is obviously not designed for the purpose.

My preferred way if I were in your situation would be to construct a regex whitelist in PHP that you can then apply to your dangerous string:

$safeString = preg_replace('/[^\w]/','',$dangerousString);

This removes any non-word characters (i.e. anything except A-Za-z0-9_) from your string.

NB I believe the other answers given will not remove/escape regex special characters, which I believe is your requirement.


You need to ensure that quotes and ticks are properly handled before passing to the database. The best method for this is:

   mysql_real_escape_string  ([php doc][1])

This method is available in both PHP, and C++ mysql client libraries.

This should ensure any 'dangerous_string' is no longer dangerous and can be used within a quoted string used by RegEx.

0

精彩评论

暂无评论...
验证码 换一张
取 消