I want to accept an arbitrary regular expression from the user and anchor it on both sides in order to enforce a full match (^<user's-regex>$). However, I don't know whether I have to take into account the fact that the user may have already anchored the regex.
It looks like Perl, C++, .NET and JavaScript all allow doubled (or further repeated) anchors:
"hello" =~ /^h/ # true
"hello" =~ /^^h/ # true
"hello" =~ /^^^h/ # true
"hello" =~ /e/ # true
"hello" =~ /^e/ # false
"hello" =~ /^^e/ # false
Does anyone know if this is specified to work this way? Can I depend on this behaviour or is it an accident that is liable to change in the future?
Edit: The reason we need this is that we're using VBScript's regexes (from COM) and calling match. However, this returns all matches, so it's much slower to match the string abc against .*a.* than against ^.*a.*$. By using the anchoring suggested by @Tim we speed matches up (for long strings) by more than a factor of 12.
You can depend on this behavior. The regex engine doesn't mind asserting the same thing once, twice, or a hundred times in a row.
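As a quick check of the claim above in one more flavor, here is a minimal Python sketch (Python is not one of the engines named in the question, and the helper name matches is made up for this illustration). It only shows that repeating an anchor leaves the result unchanged; it is not a statement about any engine's specification.

```python
import re


def matches(pattern: str, subject: str) -> bool:
    """Return True if pattern matches anywhere in subject (hypothetical helper)."""
    return re.search(pattern, subject) is not None


subject = "hello"

# A repeated anchor asserts the same position again, so the outcome is the same.
start_anchored = [matches(p, subject) for p in (r"^h", r"^^h", r"^^^h")]
end_anchored = [matches(p, subject) for p in (r"o$", r"o$$")]

# An anchor that fails once still fails when it is repeated.
failing = [matches(p, subject) for p in (r"^e", r"^^e")]

print(start_anchored, end_anchored, failing)
```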
However, instead of simply adding anchors around the regex, you should also add a non-capturing group around it:
^(?:<user regex>)$
or preferably, if your regex flavor allows this:
\A(?:<user regex>)\Z
Otherwise, you'll trip up if the user uses alternation in his regex. Compare:
user regex: hello|bye
anchored regex: ^hello|bye$ // alternation now affects anchors
correctly anchored: ^(?:hello|bye)$
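The difference between the two forms can be sketched in Python. This is only an illustration under assumptions: the function names anchor_naive and anchor_full are invented here, and the original setup uses VBScript's regex object rather than Python's re module. Note also that \Z in Python matches only at the very end of the string, whereas in Perl and .NET it also matches before a final newline.

```python
import re


def anchor_naive(user_regex: str) -> re.Pattern:
    """Add anchors without a group (hypothetical helper showing the pitfall)."""
    return re.compile("^" + user_regex + "$")


def anchor_full(user_regex: str) -> re.Pattern:
    """Wrap the user's regex in a non-capturing group, then anchor both ends."""
    return re.compile(r"\A(?:" + user_regex + r")\Z")


user_regex = "hello|bye"

# Without the group, the pattern reads as (^hello) OR (bye$),
# so a string that merely starts with "hello" is accepted.
naive_hit = anchor_naive(user_regex).search("hello world") is not None

# With the group, the whole string has to be "hello" or "bye".
full_hit = anchor_full(user_regex).search("hello world") is not None
full_bye = anchor_full(user_regex).search("bye") is not None

# A regex the user already anchored keeps working after wrapping.
already_anchored = anchor_full("^hello$").search("hello") is not None

print(naive_hit, full_hit, full_bye, already_anchored)
```

One caveat for this Python version: a user regex that starts with a global inline flag such as (?i) no longer has that flag at the start of the pattern once it is wrapped, and Python 3.11 and later reject that.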