开发者

How to break queries using regex in php

开发者 https://www.devze.com 2023-01-21 19:49 出处:网络
Suppose I have the following string: insert into table values (\'text1;\'); insert into table values (\'text2\')

Suppose I have the following string:

insert into table values ('text1;'); insert into table values ('text2')

How do I break those queries (get each individual query) using regular expressions?

I've found a very similar problem: Use regex to find specific string not in html tag ...but it uses a solution that is specific to开发者_StackOverflow .NET: behind lookup (in php it complains that is not fixed length).

I would be very grateful if someone could give me some hints on how to deal with this problem.


The trick is to count how many unescaped quote characters you've passed. Assuming that the SQL is syntactically correct, semicolons after an even number of unescaped quote characters will be the ones you want, and semicolons after an odd number of unescaped quote characters will be part of a string literal. (Remember, string literals can contain properly escaped quote characters.)

If you want 100% reliability, you'll need a real SQL parser, like this. (I just Googled "SQL parser in PHP". I don't know if it works or not.)


EDIT:

I don't think it's possible to find pairs of unescaped quote characters using nothing but regex. Maybe a regex guru will prove me wrong, but it just seems too damn difficult to distinguish between escaped and unescaped quote characters in so many possible combinations. I tried look-behind assertions and backrefs with no success.

The following is not a pure-regex solution, but I think it works:

preg_match_all("/(?:([^']*'){2})*[^']*;/U", str_replace("\\'", "\0\1\2", $input), $matches);
$output = array_map(function($str){ return str_replace("\0\1\2", "\\'", $str); }, $matches[0]);

Basically, we temporarily replace escaped quote characters with a string of bytes that is extremely unlikely to occur, in this case \0\1\2. After that, all the quote characters that remain are the unescaped ones. The regex picks out semicolons preceded by an even number of quote characters. Then we restore the escaped quote characters. (I used a closure there, so it's PHP 5.3 only.)

If you don't need to deal with quote characters inside string literals, yes, you can easily do it with pure regex.


Assuming proper SQL syntax it would probably be best to split on the semicolon.

The following regexp will work but only if all quotes come in pairs.

/.+?\'.+?\'.*?;|.+?;/

To avoid escaped single quotes:

/.+?[^\\\\]\'.+?[^\\\\]\'.*?;|.+?;/

To handle Multiple pairs of single quotes.

/.+?(?:[^\\]\'.+?[^\\]\')+.*?;|.+?;/

Tested against the following data set:

insert into table values ('text1;\' ','2'); insert into table values ('text2'); insert into test3 value ('cookie\'','fly');

Returns:

insert into table values ('text1;\' ','2');

insert into table values ('text2');

insert into test3 value ('cookie\'','fly');

I have to admit this is a pretty dirty regexp. It would not handle any sort of SQL syntax errors at all. I enjoyed the challenge of coming up with a pure regex though.


How you want to break?

You can use explode( ' ', $query ) to transform into an array.

Or if you want to get text1 and text2 values with regexp you can use preg_match( '/(\'([\w]+)\')/', $query, $matches ) where $matches[1] is your value.

preg_match_all( '/([\w ]+([\w \';]+))/', $queries, $matches ) will give to you all matches with this pattern of query.


Regex's aren't always good at this type of thing. The following function should work though:

function splitQuery($query) {
    $open = false;
    $buffer = null;
    $parts = array();
    for($i = 0, $l = strlen($query); $i < $l; $i++) {
        if ($query[$i] == ';' && !$open) {
            $parts[] = trim($buffer);
            $buffer = null;
            continue;
        }
        if ($query[$i] == "'") {
            $open = ($open) ? false: true;
        }

        $buffer .= $query[$i];
    }

    if ($buffer) $parts[] = trim($buffer);
    return $parts;
}

Usage:

$str = "insert into table values ('text1;'); insert into table values ('text2')";
$str = splitQuery($str);
print_r($str);

Outputs:

Array
(
    [0] => insert into table values ('text1;')
    [1] => insert into table values ('text2')
)
0

精彩评论

暂无评论...
验证码 换一张
取 消