
How do I push more than one matched group as the same element of an array in Perl?

Developer https://www.devze.com 2022-12-12 16:14 Source: Internet

I need to push all the matched groups into an array.

#!/usr/bin/perl
use strict; 

open (FILE, "/home/user/name") || die $!;
my @lines = <FILE>;
close (FILE);
open (FH, ">>/home/user/new") || die $!;
foreach $_(@lines){
    if ($_ =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
            print FH "$1 $2 $3 $4 $5 $6 $7\n"; #needs to be first element of array
    }
    elsif ($_ =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
            print FH "$1 $2 $3 $4 $5 $6\n"; #needs to be second element of array
    }
}
close (FH);

_ INPUT _

AB_ first--2-45_ Name_ is34_ correct_ OR_ not_W3478.txt 

CD_ second_ input_ 89-is_ diffErnt_ 76-from_Wfirst6.txt

Instead of writing the matched groups to a file, I want to push them into an array. I can't think of any command other than push, but this function does not accept more than one argument. What is the best way to do this? After pushing the matched groups, the array should look like the following.

_ OUTPUT _

$array[0] = first--2-45 Name is34 correct OR not

$array[1] = second input 89-is diffErnt 76-from


Use the same argument for push that you use for print: a string in double quotes.

push @array, "$1 $2 $3 $4 $5 $6 $7";
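As a rough sketch, here is that idea applied to both branches of the question's loop. The two input lines are hypothetical (chosen so that they contain the seven and six fields the question's regexes expect), and @array is only an illustrative name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input lines with 7 (AB) and 6 (CD) underscore-separated fields.
my @lines = (
    "AB_one_two_three_four_five_six_seven_W1234.txt\n",
    "CD_one_two_three_four_five_six_W5678.txt\n",
);

my @array;
foreach my $line (@lines) {
    if ( $line =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
        # One interpolated string becomes exactly one array element.
        push @array, "$1 $2 $3 $4 $5 $6 $7";
    }
    elsif ( $line =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
        push @array, "$1 $2 $3 $4 $5 $6";
    }
}

# Show each element with its index.
print "\$array[$_] = $array[$_]\n" for 0 .. $#array;
```

Because the captures are joined inside the quotes before push sees them, each matching line adds a single element, regardless of how many groups the regex has.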


Take a look at perldoc -f grep, which returns a list of all elements of a list that match some criterion.

And incidentally, push does take more than one argument: see perldoc -f push.

push @matches, grep { /your regex here/ } @lines;

You didn't include the code leading up to this, though. Some of it is a little odd, such as the use of $_ as a function call. Are you sure you want to do that?
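A small sketch of both points, using made-up lines and a simplified pattern (neither comes from the question). Note that grep returns the whole matching lines, not the captured groups, so it fits best when you want to keep entire lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input lines, for illustration only.
my @lines = ( "AB_x_W1.txt\n", "unrelated line\n", "CD_y_W2.txt\n" );

my @matches;

# push accepts a list: every value after the array is appended.
push @matches, 'first', 'second';

# grep returns the elements for which the block is true,
# and push appends all of them in one call.
push @matches, grep { /^(?:AB|CD)_/ } @lines;

print scalar(@matches), " elements\n";
print for @matches[ 2 .. $#matches ];
```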


If you are using Perl 5.10.1 or later, this is how I would write it.

#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1; # or use 5.010;
use autodie;

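my @lines = do{
  # don't need to check for errors, because of autodie
  open( my $file, '<', '/home/user/name' );
  # chomp the whole list; grep {chomp} would drop a last line
  # that has no trailing newline
  chomp( my @read = <$file> );
  @read;
  # $file is automatically closed
};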

# use 3 arg form of open
open( my $file, '>>', '/home/user/new' );

my @matches;
for( @lines ){
  if( /(?:AB|CD)( (?:_[^_]+)+ )_W .+ txt/x ){
    my @match = "$1" =~ /_([^_]+)/g;
    say {$file} "@match";
    push @matches, \@match;
    # or
    # push @matches, [ "$1" =~ /_([^_]+)/g ];
    # if you don't need to print it in this loop.
  }
}

close $file;

This accepts any number of fields, so it is more permissive of inputs, but the regex should be a little more "correct" than the original.
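Since @matches ends up holding array references, reading the data back takes one extra step. The following is only a sketch: it assumes the question's two sample lines with the spaces after the underscores removed, and it keeps the data in memory instead of reading a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed input: the question's sample lines without the stray spaces.
my @lines = (
    'AB_first--2-45_Name_is34_correct_OR_not_W3478.txt',
    'CD_second_input_89-is_diffErnt_76-from_Wfirst6.txt',
);

my @matches;
for my $line (@lines) {
    # First capture the whole run of "_field" pieces before "_W".
    if ( $line =~ /(?:AB|CD)( (?:_[^_]+)+ )_W .+ txt/x ) {
        # Then pull the individual fields out of that run and
        # store them as one anonymous array per line.
        push @matches, [ "$1" =~ /_([^_]+)/g ];
    }
}

# A single field: second field of the first line.
print "$matches[0][1]\n";

# A whole line's fields, joined with spaces by interpolation.
print "@{ $matches[$_] }\n" for 0 .. $#matches;
```

Keeping the fields separate this way means the joined string can always be rebuilt later, while the reverse is not true once the fields have been interpolated into one string.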


Remember that a capturing match in list context returns the captured fields, if any:

#!/usr/bin/perl

use strict; use warnings;

my $file = '/home/user/name';

open my $in, '<',  $file
    or die "Cannot open '$file': $!";

my @matched;

while ( <$in> ) {
    my @fields;
    if (@fields = /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/
            or @fields = /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/)
    {
        push @matched, "@fields";
    }
}

use Data::Dumper;
print Dumper \@matched;

Of course, you could also do

push @matched, \@fields;

depending on what you intend to do with the matches.
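To make the difference concrete, here is a minimal sketch that stores the same captures both ways. The input line and the names @as_strings and @as_refs are hypothetical, and the sketch pushes a copy of the fields ([@fields]) rather than \@fields, which has the same effect here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical line with the six fields the CD regex expects.
my $line = 'CD_one_two_three_four_five_six_W5678.txt';

my ( @as_strings, @as_refs );

# In list context the match returns the captured fields;
# the condition is true when at least one field was captured.
if ( my @fields = $line =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
    push @as_strings, "@fields";    # one space-joined string
    push @as_refs,    [@fields];    # reference to a copy of the fields
}

print "$as_strings[0]\n";    # the whole record as text
print "$as_refs[0][2]\n";    # a single field, here the third one
```

The string form is convenient for printing, while the reference form keeps each field individually addressable.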


I wonder if using push and giant regexes is really the right way to go.

The OP says he wants lines starting with AB at index 0, and those with CD at index 1. Also, those regexes look like an inside-out split to me.

In the code below I have added some didactic comments that point out why I am doing things differently than the OP and the other solutions offered here.

#!/usr/bin/perl
use strict;
use warnings; # best use warnings too.  strict doesn't catch everything

my $filename = "/home/user/name";

# Using 3 argument open addresses some security issues with 2 arg open.
# Lexical filehandles are better than global filehandles, they prevent
#   most accidental filehandle name collisions, among other advantages.
# Low precedence or operator helps prevent incorrect binding of die 
#   with open's args
# Expanded error message is more helpful
open( my $inh, '<', $filename ) 
    or die "Error opening input file '$filename': $!";

my @file_data;

# Process file with a while loop.
# This is VERY important when dealing with large files.
# for will read the whole file into RAM.
# for/foreach is fine for small files.
while( my $line = <$inh> ) {
    chomp $line;

    # Simple regex captures the data type indicator and the data.
    if( $line =~ /(AB|CD)_(.*)_W.+txt/ ) {

        # Based on the type indicator we set variables 
        # used for validation and data access.

        my( $index, $required_fields ) = $1 eq 'AB' ? ( 0, 7 )
                                       : $1 eq 'CD' ? ( 1, 6 )
                                       : ();
        next unless defined $index;

        # Why use a complex regex when a simple split will do the same job?
        my @matches = split /_/, $2;

        # Here we validate the field count, since split won't check that for us.
        unless( @matches == $required_fields ) {
            warn "Incorrect field count found in line '$line'\n";
            next;
        }        

        # Warn if we have already seen a line with the same data type.
        if( defined $file_data[$index] ) {
            warn "Overwriting data at index $index: '@{$file_data[$index]}'\n";
        }

        # Store the data at the appropriate index.
        $file_data[$index] = \@matches;
    }
    else {
        warn "Found non-conformant line: $line\n";
    }

}

Be forewarned, I just typed this into the browser window. So, while the code should be correct, there may be typos or missed semicolons lurking--it's untested, use it at your own peril.
