Why does this regular expression suppress more than I was expecting?

Developer https://www.devze.com 2023-02-07 11:50 Source: web
I am trying to suppress strings that begin with [T without doing a positive match and negating the results.

my @tests = ("OT", "[T","NOT EXCLUDED");
foreach my $test (@tests)
{
 #match from start of string, 
 #include 'Not left sq bracket' then include 'Not capital T'
 if ($test =~ /^[^\[][^T]/)  #equivalent to /^[^\x5B][^T]/
 {
  print $test,"\n";
 }
}

Outputs

NOT EXCLUDED

My question is, can somebody tell me why OT is being excluded in the above example?

EDIT Thanks for your replies so far everybody, I can see I was being a bit stoopid.


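The regex ^[^\[][^T] matches strings that begin with a character other than [, followed by a character other than T. Each character class has to consume one character, so the string also needs to be at least two characters long.

Since OT has T as its second character, it is not matched.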

If you want to match any string other than those that begin with [T, you can do:

if ($test =~ /^(?!\[T)/) {
   print $test,"\n";
}
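
As a rough sketch of how the lookahead behaves, the snippet below runs it over the original test strings plus a few edge cases that are not in the question ("[", "O", the empty string and "[Tail"). The helper name keeps is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the string does NOT begin with "[T".
sub keeps {
    my ($s) = @_;
    # (?!\[T) is a zero-width negative lookahead: it consumes nothing,
    # so strings shorter than two characters can still match.
    return $s =~ /^(?!\[T)/ ? 1 : 0;
}

# The first three strings come from the question; the rest are extra edge cases.
my @tests = ("OT", "[T", "NOT EXCLUDED", "[", "O", "", "[Tail");
for my $test (@tests) {
    printf "%-16s %s\n", "'$test'", keeps($test) ? "kept" : "suppressed";
}
```

Only "[T" and "[Tail" are suppressed here. Note that "[", "O" and the empty string are kept, whereas the original two-character-class pattern would reject all three.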


YAPE::Regex::Explain can be helpful:

$ perl -MYAPE::Regex::Explain -E 'say YAPE::Regex::Explain->new(qr/^[^\[][^T]/)->explain'
The regular expression:

(?-imsx:^[^\[][^T])

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [^\[]                    any character except: '\['
----------------------------------------------------------------------
  [^T]                     any character except: 'T'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------


Your regex translates to:

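From the start of the input, match one character that is not an opening square bracket ([), followed by one character that is not a capital T

  • OT fails to match, because its second character is T
  • [T fails as well, because its first character is [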


Your expression is equivalent to "the first character is NOT [ and the second character is NOT T", so the only string that passes is NOT EXCLUDED. OT is rejected because its second letter is T.
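
Put another way, what was wanted is "NOT (first is [ and second is T)", which by De Morgan's law is "first is NOT [ OR second is NOT T". If a lookahead is not an option, that OR can be spelled out with character classes and alternation. This is only a sketch: the helper name not_bracket_t is made up, and the \z branches are an assumption that strings shorter than two characters should also be let through.

```perl
use strict;
use warnings;

# Character-class-only pattern for "does not begin with [T".
my $re = qr/
    ^
    (?: \z                   # empty string
      | [^\[]                # first character is not an opening bracket
      | \[ (?: \z | [^T] )   # opening bracket, then end of string or a non-T
    )
/x;

# Hypothetical helper wrapping the pattern above.
sub not_bracket_t {
    my ($s) = @_;
    return $s =~ $re ? 1 : 0;
}

# Cross-check against the negative lookahead on a handful of strings.
for my $test ("OT", "[T", "NOT EXCLUDED", "[", "", "[x", "[Tx") {
    my $lookahead = $test =~ /^(?!\[T)/ ? 1 : 0;
    printf "%-16s classes=%d lookahead=%d\n",
        "'$test'", not_bracket_t($test), $lookahead;
}
```

The two columns agree for every string in this list, but the lookahead version is much easier to read and is probably the better choice in practice.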
