How can I store captures from a Perl regular expression into separate variables?

Developer https://www.devze.com 2022-12-20 15:40 Source: web
I have a regex:

/abc(def)ghi(jkl)mno(pqr)/igs

How would I capture the result of each set of parentheses into 3 different variables, one for each set? Right now I am using one array to capture all the results. They come out sequentially, but then I have to parse them, and the list could be huge.

@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);


Your question is a bit ambiguous to me, but I think you want to do something like this:

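my (@first, @second, @third);
# Match in scalar context so /g advances one match per iteration.
# A list assignment in the condition would rerun the match from the
# start every time and never terminate.
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first,  $1;
    push @second, $2;
    push @third,  $3;
}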
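
If the flat list from a single list-context match is already in hand, as in the question, it can also be split into per-group arrays after the fact. The following is only a sketch: the sample string is an assumption made up for illustration, and splice empties @results as it goes.

```perl
use strict;
use warnings;

# Sample input is an assumption: the question's pattern occurring twice.
my $string = 'zzzabcdefghijklmnopqrsssszzzABCDEFGHIJKLMNOPQRssss';

# List context with /g returns every capture of every match as one flat list.
my @results = $string =~ /abc(def)ghi(jkl)mno(pqr)/igs;

my (@first, @second, @third);
# Take three captures at a time; the loop ends when splice returns nothing.
while (my ($f, $s, $t) = splice @results, 0, 3) {
    push @first,  $f;
    push @second, $s;
    push @third,  $t;
}

print "first:  @first\n";
print "second: @second\n";
print "third:  @third\n";
```

This still builds the full flat list first, so for very large inputs the scalar-context while loop is likely the lighter choice.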


Starting with 5.10, you can use named capture buffers as well:

#!/usr/bin/perl

use strict; use warnings;

my %data;

my $s = 'abcdefghijklmnopqr';

if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

use Data::Dumper;
print Dumper \%data;

Output:

$VAR1 = {
          'first' => [
                       'def'
                     ],
          'second' => [
                        'jkl'
                      ],
          'third' => [
                       'pqr'
                     ]
        };
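
The example above uses if, so it records only the first match. Since the question uses /g, here is a hedged sketch of the same idea in a while loop, which collects every match. The sample string is an assumption chosen so the pattern occurs twice.

```perl
use strict;
use warnings;

my %data;

# Sample input is an assumption: the pattern repeated twice.
my $s = 'abcdefghijklmnopqr abcdefghijklmnopqr';

# In scalar context, /g resumes after the previous match on each iteration.
while ($s =~ /abc(?<first>def)ghi(?<second>jkl)mno(?<third>pqr)/g) {
    # %+ holds the named captures of the most recent successful match.
    push @{ $data{$_} }, $+{$_} for keys %+;
}

print "$_: @{ $data{$_} }\n" for sort keys %data;
```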

For earlier versions, you can use the following which avoids having to add a line for each captured buffer:

#!/usr/bin/perl

use strict; use warnings;

my $s = 'abcdefghijklmnopqr';

my @arrays = \ my(@first, @second, @third);

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}

use Data::Dumper;
print Dumper @arrays;

Output:

$VAR1 = [
          'def'
        ];
$VAR2 = [
          'jkl'
        ];
$VAR3 = [
          'pqr'
        ];

But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:

my %data;
my @keys = qw( first second third );

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}

Or, if the names of the variables really are first, second etc, or if the names of the buffers don't matter but only order does, you can use:

my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}


An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:

my @results;
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    my ($key1, $key2, $key3) = ($1, $2, $3);
    push @results, { 
        key1 => $key1,
        key2 => $key2,
        key3 => $key3,
    };
}

# do something with it

foreach my $result (@results) {
    print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}

The main advantages here are a single data structure and a readable loop.


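@OP, when parentheses capture part of a match, the captured text is available in the variables $1, $2, and so on. These are capture variables (backreferences proper, such as \1, are used inside the pattern itself).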

$string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
    print "$1 $2 $3\n";
}

output

$ perl perl.pl
def jkl pqr
def jkl pqr


You could have three different regexes, each focusing on a specific group. Obviously, you would like to just assign different groups to different arrays in one regex, but I think your only option is to split the regex up.
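
A rough sketch of what splitting the regex up could look like, assuming the same sample string used in another answer here. Each pattern keeps one capturing group and turns the others into non-capturing (?:...) groups, so a list-context match with /g returns only that group's text for every match.

```perl
use strict;
use warnings;

# Sample input is an assumption: the pattern occurring twice.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

# One capturing group per pattern; (?:...) matches without capturing.
my @first  = $string =~ /abc(def)ghi(?:jkl)mno(?:pqr)/ig;
my @second = $string =~ /abc(?:def)ghi(jkl)mno(?:pqr)/ig;
my @third  = $string =~ /abc(?:def)ghi(?:jkl)mno(pqr)/ig;

print "@first | @second | @third\n";
```

The trade-off is that the string is scanned three times and the pattern is repeated in three places.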


You can write a regex containing named capture groups. You do this with the ?<myvar> construct at the beginning of the capture group:

/(?<myvar>[0-9]+)/

You may then refer to those named capture groups using a $+{myvar} form.

Here is a contrived example:

perl -ne 'print $+{myvar}, "\n" if /^systemd-(?<myvar>[^:]+)/' /etc/passwd

Given a typical password file, it pulls out the systemd users and prints their names without the systemd- prefix. It uses a capture group named myvar. This is just an example thrown together to illustrate the use of capture group variables.
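
For trying this without touching a real password file, the same idea can be sketched as a small script. The sample lines below are made up for illustration, and @names is a hypothetical variable, not something from the one-liner.

```perl
use strict;
use warnings;

# Made-up sample lines in passwd format; not taken from a real system.
my @lines = (
    'root:x:0:0:root:/root:/bin/bash',
    'systemd-network:x:100:102::/run/systemd:/usr/sbin/nologin',
    'systemd-resolve:x:101:103::/run/systemd:/usr/sbin/nologin',
);

my @names;
for my $line (@lines) {
    # $+{myvar} holds the text matched by the group named myvar.
    push @names, $+{myvar} if $line =~ /^systemd-(?<myvar>[^:]+)/;
}

print "$_\n" for @names;
```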
