regex decompiler

Developer https://www.devze.com 2023-03-08 19:45 Source: Internet

I found this regex and want to understand it. Are there any regex decompilers that will translate what the following regex does into words? It is really complicated.


$text =~ /(((\w)\W*(?{$^R.(0+( q{a}lt$3))})) {8}(?{print +pack"B8" ,$^Rand ""})) +/x;

Using YAPE::Regex::Explain (I'm not sure how good it is, but it was the first search result):

use YAPE::Regex::Explain;
my $REx = qr/(((\w)\W*(?{$^R.(0+( q{a}lt$3))})) {8}(?{print +pack"B8" ,$^Rand ""})) +/x;
my $exp = YAPE::Regex::Explain->new($REx)->explain;
print $exp;

I've got the explanation as:

  (                        group and capture to \1 (1 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    (                        group and capture to \2 (8 times):
----------------------------------------------------------------------
      (                        group and capture to \3:
----------------------------------------------------------------------
        \w                       word characters (a-z, A-Z, 0-9, _)
----------------------------------------------------------------------
      )                        end of \3
----------------------------------------------------------------------
      \W*                      non-word characters (all but a-z, A-Z,
                               0-9, _) (0 or more times (matching the
                               most amount possible))
----------------------------------------------------------------------
      (?{$^R.(0+(              run this block of Perl code
      q{a}lt$3))})
----------------------------------------------------------------------
    ){8}                     end of \2 (NOTE: because you are using a
                             quantifier on this capture, only the
                             LAST repetition of the captured pattern
                             will be stored in \2)
----------------------------------------------------------------------
    (?{print +pack"B8"       run this block of Perl code
    ,$^Rand ""})
----------------------------------------------------------------------
  )+                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)

There are 2 blocks of Perl code, which must be analyzed independently.

In the first block:

$^R . (0 + (q{a} lt $3))

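Here, $^R is "the result of evaluation of the last successful (?{ code }) regular expression assertion", so the block appends one digit to whatever has been accumulated so far. The expression (0 + (q{a} lt $3)) gives 1 if the 3rd capture sorts after "a" as a string, i.e. is in [b-z], and 0 otherwise (for a digit, an uppercase letter, an underscore, or "a" itself).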
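To see which characters count as 1, here is a minimal sketch that applies the same comparison outside the regex. The helper name bit_for is made up for illustration, and $3 is replaced by an ordinary argument; plain ASCII string comparison (no locale) is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: the expression from the first code block,
# with the capture $3 replaced by an ordinary argument.
sub bit_for {
    my ($char) = @_;
    return 0 + (q{a} lt $char);    # string comparison, then forced to a number
}

# Lowercase b-z sort after "a"; digits, uppercase letters, "_" and "a" do not.
for my $char (qw(a b z A Z 0 9 _)) {
    printf "%s => %d\n", $char, bit_for($char);
}
```

Only b and z should come out as 1 in this list.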

In the second block:

print +pack "B8", $^R and ""

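It interprets the accumulated result as a (big-endian) binary string, gets the number, converts it to the corresponding character, and finally prints it out. Because print returns a true value, the trailing and "" makes the whole block evaluate to the empty string, so $^R starts out empty again for the next group of 8.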
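The two pieces of that block can be tried on their own. This is only a sketch: byte_from_bits is a hypothetical helper name, and the bit string is passed in directly instead of coming from $^R.

```perl
use strict;
use warnings;

# Hypothetical helper: pack eight "0"/"1" characters into one byte,
# most significant bit first, which is what the "B8" template does.
sub byte_from_bits {
    my ($bits) = @_;
    return pack "B8", $bits;
}

my $byte = byte_from_bits("01001000");
printf "%s (0x%02X)\n", $byte, ord $byte;    # prints: H (0x48)

# print returns true on success, so `print ... and ""` yields "",
# which is the value the regex engine then stores in $^R.
my $reset = do { print "\n" and "" };
print "block value is the empty string\n" if $reset eq "";
```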

Together, the regex takes the word characters 8 at a time (skipping any non-word characters between them), then treats those in [b-z] as the binary digit 1 and all others as 0. These 8 binary digits are then interpreted as a character code, and that character is printed out.
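The same behaviour can be written as an ordinary loop, which may be easier to follow than the embedded code blocks. This is a sketch under assumptions, not a drop-in replacement: decode_hidden and the sample text are invented for illustration, the result is returned instead of printed, and the regex engine's backtracking details are ignored.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors the regex with an explicit loop.
sub decode_hidden {
    my ($text) = @_;
    my $out  = '';
    my $bits = '';
    for my $char ($text =~ /\w/g) {         # every word character, in order
        $bits .= 0 + (q{a} lt $char);       # 1 for b-z, 0 otherwise
        if (length($bits) == 8) {
            $out .= pack "B8", $bits;       # 8 bits become one character
            $bits = '';                     # start the next group empty
        }
    }
    return $out;    # an incomplete trailing group is dropped
}

# Sample text made up for this sketch: its 16 word characters encode two bytes.
print decode_hidden("Oh NO, iT IS A bc D e FG h"), "\n";    # prints: Hi
```

In the sample, O h N O i T I S gives 01001000 ('H') and A b c D e F G h gives 01101001 ('i').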

For instance, the letter 'H' = 0b01001000 would be printed when matching the string