开发者

Is this regular expression exponential?

开发者 https://www.devze.com 2023-01-11 20:58 出处:网络
I would like to know if: /.*(Set-Cookie: (.*))?;.*(<\\?xml.*)/ is an e开发者_JS百科xponential regexp.

I would like to know if:

/.*(Set-Cookie: (.*))?;.*(<\?xml.*)/

is an e开发者_JS百科xponential regexp.

Thanks


It really depends on the regex engine, but in most engines, that pattern is probably polynomial of a high degree (maybe cubic or higher) when there's no match.

You can use e.g. RegexBuddy to see how many steps it takes to match, and more importantly, to not match certain input. You can use this to benchmark how complex the backtracking process may be in certain engines.

It's not clear exactly what you are trying to do, but that pattern really doesn't do much with the Set-Cookie subpattern allowed to be optional (e.g. the group may not match that string even if it exists, since it's optional to begin with).

If you are trying to parse XML, then please, please, please do not use regular expressions. There are many XML parsers available in most modern languages, and they would not only be appropriate for the job, but they'd also be correct and much more pleasant to work with than regex.

References

  • regular-expressions.info/Catastrophic Backtracking

Related questions

  • Detect if a regexp is exponential

The pattern, debunked

To point out why that pattern doesn't "work" (which would render irrelevant whether or not it's fast or slow), consider the following input:

Set-Cookie: NOMNOMNOM;<?xml

With the pattern /.*(Set-Cookie: (.*))?;.*(<\?xml.*)/, the entire string is a match, but group 1 doesn't capture Set-Cookie: NOMNOMNOM, and group 2 doesn't capture NOMNOMNOM (as seen on rubular.com). That's because the leading .* gobbled up the cookie, and since the cookie subpattern is optional, it's still a match anyway.

We can try to "fix" this by making the leading .* reluctant as .*?. Now, group 1 can match Set-Cookie, which is perhaps the intent all along (as seen on rubular.com).

However, this is hardly an improvement. You really do not want to go down this direction. There are still many problems with the regex, and just getting it to work right will be very difficult, if not nearly impossible.

It should be noted that the pattern as given does match ";<?xml" (as seen on rubular.com). That is, as long as there's a ; anywhere in the string, and then later a <?xml, the pattern will match. It's not clear if this pattern really does anything useful.

Related questions

  • Difference between .*? and .* for regex
0

精彩评论

暂无评论...
验证码 换一张
取 消