
Perl Regex Explanation

Developer https://www.devze.com 2023-02-28 06:39 Source: Internet
Hey, so I am not good with regex right now, but I am trying to learn. Can someone explain this one to me bit by bit?

if ($fileStrings[$stringCount] =~ m/((?:include|require)(?:_once)?\s*\(.*?\$.*?\);)/gi)

Thanks


m/((?:include|require)(?:_once)?\s*(.?\$.?);)/gi

m the match operator

/ pattern delimiter

(?:include|require) match, but do not capture, 'include' or 'require'

(?:_once)? optionally match, but do not capture, '_once'

\s* zero or more whitespace characters (spaces, tabs, newlines, and so on)

(.?\$.?) match and capture zero or one of any character, followed by a literal $ character, followed by zero or one of any character

; match a literal semicolon

(...) outer parentheses: capture the whole match

/ pattern delimiter

gi modifiers: g for a global match, i for a case-insensitive match
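
Note that the pattern as posted in the question reads \(.*?\$.*?\) where the breakdown above has (.?\$.?). In that form the parentheses are escaped, so they match literal ( and ) characters instead of capturing, and .*? matches as few characters as possible rather than at most one. Here is a minimal sketch of how the question's form behaves. The helper name match_statement and the sample lines are made up for illustration and are not from the original post:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Pattern exactly as posted in the question: literal parentheses, non-greedy .*?
my $question_re = qr/((?:include|require)(?:_once)?\s*\(.*?\$.*?\);)/i;

# Hypothetical helper: returns the captured statement, or undef if nothing matches.
sub match_statement {
    my ($line) = @_;
    return $line =~ $question_re ? $1 : undef;
}

# Made-up sample inputs (PHP-style statements).
for my $line (
    'include($path);',                            # matches: a $ appears inside ( )
    'require_once ("lib/" . $name . ".php");',    # matches: a $ appears inside ( )
    'include("static.php");',                     # no match: no $ inside ( )
) {
    my $hit = match_statement($line);
    printf "%-45s => %s\n", $line, defined $hit ? $hit : 'no match';
}
```

Read this way, the pattern looks for include/require calls whose parenthesised argument contains a variable.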


I usually find it easy to write a test program to check my thoughts. Maybe this will help you understand what the regex is doing:

#! /usr/bin/env perl

use warnings;
use strict;
use feature qw(say);

for my $line (
    'include_once  F$G;',
    'require_once  F$G;',
    'INCLUDE  F$G;',
    'include_once      AF$G;',
    'include_once  F$G;',
) {
    if ($line =~ m/((?:include|require)(?:_once)?\s*(.?\$.?);)/gi) {
        say qq(Line = "$line");
        say qq(\$1 = "$1");
        say qq(\$2 = "$2"\n);
    }
    else {
        say qq(Line = "$line");
        say "No match!\n";
    }
}

And the output is:

Line = "include_once  F$G;"
$1 = "include_once  F$G;"
$2 = "F$G"

Line = "require_once  F$G;"
$1 = "require_once  F$G;"
$2 = "F$G"

Line = "INCLUDE  F$G;"
$1 = "INCLUDE  F$G;"
$2 = "F$G"

Line = "include_once      AF$G;"
No match!

Line = "include_once  F$G;"
$1 = "include_once  F$G;"
$2 = "F$G"
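
The "No match!" for the AF$G line follows from .? matching at most one character: two characters (A and F) cannot fit between the whitespace and the $. The sketch below compares that pattern with a non-greedy .*? variant. The second_group helper and the .*? variant are illustrative assumptions, not part of the test program above:

```perl
use strict;
use warnings;

# Pattern from the test program: at most one character on each side of the $.
my $one_char = qr/((?:include|require)(?:_once)?\s*(.?\$.?);)/i;

# Illustrative variant (an assumption, not from the post): any number of
# characters on each side of the $, matched non-greedily.
my $any_chars = qr/((?:include|require)(?:_once)?\s*(.*?\$.*?);)/i;

# Hypothetical helper: returns the second capture group, or undef on no match.
sub second_group {
    my ($re, $line) = @_;
    return $line =~ $re ? $2 : undef;
}

my $line = 'include_once      AF$G;';
for my $pair ([ 'one_char', $one_char ], [ 'any_chars', $any_chars ]) {
    my ($name, $re) = @$pair;
    my $group = second_group($re, $line);
    print "$name: ", (defined $group ? qq("$group") : 'no match'), "\n";
}
```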

The parentheses capture parts of the match into the variables $1, $2, $3, and so on. A ?: at the start of a group stops that pair of parentheses from capturing (which is why the value ends up in $2 rather than $4). The outer parentheses have no ?:, so they still capture the entire match, including the text matched by the non-capturing groups inside them.
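
To see the numbering shift, here is a small sketch that runs the same line through the pattern with and without the ?: markers. The captures helper and the all-capturing variant are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns every capture group of the first match as a list
# (an empty list if the pattern does not match).
sub captures {
    my ($re, $line) = @_;
    return $line =~ $re;
}

my $line = 'require_once  F$G;';

# As in the post: two groups are non-capturing, so only $1 and $2 are set.
my @with_non_capturing =
    captures(qr/((?:include|require)(?:_once)?\s*(.?\$.?);)/i, $line);

# Same pattern with the ?: removed (illustration only): four groups, $1 to $4.
my @all_capturing =
    captures(qr/((include|require)(_once)?\s*(.?\$.?);)/i, $line);

print scalar(@with_non_capturing), " groups: @with_non_capturing\n";
print scalar(@all_capturing), " groups: @all_capturing\n";
```

In the second list, 'require' and '_once' occupy positions 2 and 3, which pushes F$G to position 4.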

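The g modifier at the end does not make the match span multiple lines, which is why it made no difference in my tests. It requests a global match: in list context the match returns the captures from every match in the string, and in scalar context (as in the if above) each evaluation finds just the next match, picking up where the previous one ended.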
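
Here is a minimal sketch of g at work, assuming a single string that holds several statements. The all_statements helper and the input text are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: collects every matching statement in $text by
# re-running the /g match in scalar context until it fails.
sub all_statements {
    my ($text) = @_;
    my @found;
    while ($text =~ /((?:include|require)(?:_once)?\s*(.?\$.?);)/gi) {
        push @found, $1;    # each pass resumes after the previous match
    }
    return @found;
}

# Made-up input with two statements on separate lines.
my $text = join "\n", 'include $a;', 'require_once $b;';

my @found = all_statements($text);
print scalar(@found), " statements found\n";
print "  $_\n" for @found;
```

Without the g, the while condition would match the first statement again on every pass and never advance through the string.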
