How to convert a PCRE to a POSIX RE?

开发者 (Developer) https://www.devze.com 2022-12-30 21:57 Source: Internet
This interesting question, "Regex to match anything (including the empty string) except a specific given string", concerned how to do a negative look-ahead in MySQL. The poster wanted to get the effect of

Kansas(?! State)

Because MySQL doesn't implement look-ahead assertions, a number of answers came up with the equivalent

Kansas($|[^ ]| ($|[^S])| S($|[^t])| St($|[^a])| Sta($|[^t])| Stat($|[^e]))

The poster pointed out that's a PITA to do for potentially lots of expressions.

Is there a script/utility/mode of PCRE (or some other package) that will convert a PCRE (if possible) to an equivalent regex that doesn't use Perl's snazzy features? I'm fully aware that some Perl-style regexes cannot be stated as an ordinary regex, so I would not expect the tool to do the impossible, of course!
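
For the narrow case shown in the question (a literal prefix followed by a negative look-ahead on a literal string), the hand expansion is mechanical, so it can be generated. The following is a minimal Python sketch of that one case only, not a general PCRE-to-POSIX converter. The function name `expand_negative_lookahead` is hypothetical, and the sketch assumes both strings contain only ASCII letters, digits, and spaces so that no escaping is needed.

```python
def expand_negative_lookahead(prefix, forbidden):
    """Build a POSIX-ERE-style pattern for `prefix(?!forbidden)`.

    Hypothetical helper: handles only literal strings made of ASCII
    letters, digits, and spaces, so nothing has to be escaped.
    """
    text = prefix + forbidden
    if not forbidden or not all(
        c.isascii() and (c.isalnum() or c == " ") for c in text
    ):
        raise ValueError("only non-empty ASCII alphanumeric/space literals are supported")

    alternatives = []
    for i, ch in enumerate(forbidden):
        # After matching the first i characters of `forbidden`, succeed if
        # the text ends here or the next character differs from `ch`.
        if i == 0:
            alternatives.append(f"$|[^{ch}]")
        else:
            alternatives.append(f"{forbidden[:i]}($|[^{ch}])")
    return f"{prefix}({'|'.join(alternatives)})"


if __name__ == "__main__":
    print(expand_negative_lookahead("Kansas", " State"))
```

Note that the generated pattern may consume characters after the prefix, which a real look-ahead would not, so it is only a drop-in replacement when the pattern is used as a yes/no test (as with MySQL's `REGEXP`). The pattern also grows with the length of the forbidden string, which is the mild end of the size problem described in the answer below.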


You don't want to do this. It isn't actually mindbogglingly difficult to translate the advanced features to basic features - it's just another flavor of compiler, and compiler writers are pretty clever people - but most of the things that the snazzy features solve are (a) impossible to do with a standard regex because they recognize non-regular languages, so you'd have to approximate them so that at least they work for a limited-length text or (b) possible, but only with a regex of exponential size. And 'exponential' is compsci-speak for "don't go there". You will get swamped in OutOfMemory errors and seemingly-infinite loops if you try to use an exponential solution on anything you would actually want to process.

In other words: "Abandon all hope, ye who enter here." It is virtually always better to let the regex do what it's good at and do the rest with other tools. Even such a simple thing as inverting a regex is much, much more easily solved by combining the original regex with the negation operator than by using the monstrosity that would result from an accurate regex inverter.
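
As an illustration of the "negation operator" approach, here is a minimal Python sketch that keeps the original pattern unchanged and inverts the test in the host language. The helper name `non_matching` and the sample data are made up for the example. In MySQL the analogous move would be the `NOT REGEXP` operator.

```python
import re


def non_matching(pattern, lines):
    """Return the lines that do NOT contain a match for `pattern`.

    The pattern itself stays simple; the inversion is done by `not`
    in the surrounding code rather than inside the regex.
    """
    rx = re.compile(pattern)
    return [line for line in lines if not rx.search(line)]


if __name__ == "__main__":
    rows = ["Kansas", "Kansas State", "Kansas City"]
    print(non_matching(r"Kansas State", rows))
```

This inverts a whole-line test. It does not reproduce the position-by-position behaviour of a look-ahead, so whether it fits depends on what the query really needs to express.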
