
Can I safely use extended regular expressions all the time rather than basic?

Developer https://www.devze.com 2022-12-27 01:27 Source: Internet
It appears that most modern languages and tools allow for extended regular expressions, and ERE looks a lot cleaner than BRE with all those backslashes. Are there any major drawbacks in compatibility or maintainability when using ERE instead of BRE?


There are clearly drawbacks for compatibility with existing patterns, but this doesn't affect new regexes you write. I always use extended regexes. In fact, this is the default for most regex libraries these days, so just go with the flow.


It depends on your environment and audience.

BREs are becoming rarer, and I think that more tools probably support ERE than BRE at this point. Even GNU grep, whose default syntax is nominally BRE, accepts `\|`, `\+`, and `\?` as extensions, and `grep -E` selects ERE syntax outright.

Most of what is called 'BRE' is really not BRE. If any of `|`, `+`, or `?` work as metacharacters when backslash-escaped, then your BRE is really ERE with BRE escaping.

I think that BRE is the exception and should be avoided unless there is something in your environment that requires it.
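
To make "ERE with BRE escaping" concrete, here is a minimal sketch in Python that flips the escaping of the characters whose meaning differs between the two syntaxes. The function names `bre_to_ere` and `_bracket_end` are hypothetical, not part of any library. The sketch assumes GNU-style BRE (where `\|`, `\+`, and `\?` are operators), copies bracket expressions through unchanged, and ignores context rules such as a leading `*` being literal in BRE; back-references like `\1` pass through untouched even though POSIX ERE does not define them.

```python
# Characters whose "special vs. literal" status flips between BRE and ERE.
# ( ) { } are POSIX; treating | + ? this way assumes GNU-style BRE extensions.
SWAPPED = set("(){}|+?")


def _bracket_end(pattern: str, start: int) -> int:
    """Return the index just past the bracket expression that opens at start."""
    j = start + 1
    if j < len(pattern) and pattern[j] == "^":
        j += 1
    if j < len(pattern) and pattern[j] == "]":
        j += 1  # a leading ] is a literal member of the set
    while j < len(pattern) and pattern[j] != "]":
        if pattern[j] == "[" and j + 1 < len(pattern) and pattern[j + 1] in ":.=":
            # Skip over [:class:], [.coll.], and [=equiv=] elements.
            close = pattern.find(pattern[j + 1] + "]", j + 2)
            j = close + 2 if close != -1 else len(pattern)
        else:
            j += 1
    return min(j + 1, len(pattern))


def bre_to_ere(pattern: str) -> str:
    """Rewrite GNU-style BRE escaping as ERE escaping (simplified sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \( is a group in BRE; ERE spells it as a bare (.
            out.append(nxt if nxt in SWAPPED else ch + nxt)
            i += 2
        elif ch == "[":
            end = _bracket_end(pattern, i)
            out.append(pattern[i:end])  # bracket contents are copied verbatim
            i = end
        elif ch in SWAPPED:
            # A bare ( is a literal in BRE; ERE spells it as \(.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, `bre_to_ere(r"a\(b\|c\)\+d")` yields `a(b|c)+d`, which shows how much backslash noise disappears once the same pattern is written in ERE.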


I don't think "BRE vs. ERE" is a very useful distinction these days. There are still many tools based on ERE, like awk and GNU grep, as well as the regex support in databases like MySQL and Oracle, but BRE is practically a footnote.

Moreover, the regex flavors built into most modern programming languages go way beyond ERE in terms of features. Even JavaScript, the least powerful of the lot, supports non-capturing groups, reluctant quantifiers, and lookaheads. It would probably be more helpful to classify regex flavors as "ERE vs. ECMA+", but there's a lot more to it than that.
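
As a rough illustration of the features named above that go beyond ERE, here is a small sketch using Python's `re` module, which also supports all three. The sample strings and variable names are made up for the example.

```python
import re

text = "price: 100USD, cost: 250EUR"

# Non-capturing group (?:...): groups the alternatives without creating a capture,
# so findall returns the whole match rather than just the group.
amounts = re.findall(r"\d+(?:USD|EUR)", text)

# Lookahead (?=...): match digits only when "EUR" follows, without consuming it.
euro_amounts = re.findall(r"\d+(?=EUR)", text)

html = "<b>one</b> and <b>two</b>"

# Reluctant (lazy) quantifier *?: stops at the first closing tag it can.
lazy = re.findall(r"<b>.*?</b>", html)

# Greedy quantifier *: runs to the last closing tag on the line.
greedy = re.findall(r"<b>.*</b>", html)
```

None of these constructs can be expressed in POSIX ERE syntax, which is why the ERE label says little about what a modern language's regex engine can do.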

If you're programming in Tcl, you use \y and \m to match word boundaries; in JavaScript you learn to love [\s\S] because, until ES2018 added the `s` flag, there was no dot-matches-newlines mode; in older versions of Visual Studio you use @ and # instead of *? and +? for minimal matching. And, although Java has a thoroughly modern regex flavor, it has no regex literals and no raw/literal/verbatim string notation, so you go blind from looking at all the backslashes anyway.
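
Two of these quirks can be sketched in Python, chosen here only as a convenient language for illustration: the `[\s\S]` workaround next to a real dot-matches-newlines mode (`re.DOTALL` in Python), and the backslash doubling you get when a language lacks raw string notation. The sample text and variable names are assumptions for the example.

```python
import re

text = "start\nmiddle\nend"

# By default "." does not match a newline, so this search finds nothing.
no_match = re.search(r"start.*end", text)

# The [\s\S] workaround: "whitespace or non-whitespace" matches any character.
workaround = re.search(r"start[\s\S]*end", text)

# With a dot-matches-newlines mode, the workaround is unnecessary.
dotall = re.search(r"start.*end", text, re.DOTALL)

# Without raw strings, every regex backslash must be escaped again for the
# string literal, which is the situation described for Java above.
escaped = "\\bend\\b"
raw = r"\bend\b"
same_pattern = escaped == raw
```

Both spellings produce the same pattern; the raw string is simply easier to read, which is the point being made about Java.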

In practice, this isn't really a choice you have to make anyway. Once you've decided which tool to employ, you use whatever regex flavor it demands.


ref: Flavor comparison chart
