I need to push all the matched groups into an array.
#!/usr/bin/perl
use strict;
open (FILE, "/home/user/name") || die $!;
my @lines = <FILE>;
close (FILE);
open (FH, ">>/home/user/new") || die $!;
foreach $_(@lines){
if ($_ =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
print FH "$1 $2 $3 $4 $5 $6 $7\n"; #needs to be first element of array
}
elsif ($_ =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/){
print FH "$1 $2 $3 $4 $5 $6\n"; #needs to be second element of array
}
}
close (FH);
_ INPUT _
AB_ first--2-45_ Name_ is34_ correct_ OR_ not_W3478.txt
CD_ second_ input_ 89-is_ diffErnt_ 76-from_Wfirst6.txt
Instead of writing the matched groups to a file, I want to push them into an array. I can't think of any command other than push, but this function does not accept more than one argument. What is the best way to do this? The output should look like the following after pushing the matched groups into the array.
_ OUTPUT _
$array[0] = first--2-45 Name is34 correct OR not
$array[1] = second input 89-is diffErnt 76-from
Use the same argument for push that you use for print: a string in double quotes.
push @array, "$1 $2 $3 $4 $5 $6 $7";
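A minimal sketch of this suggestion in context, assuming the lines are already in memory. The two sample lines are hypothetical (they are not the question's input) and were chosen so that they contain the seven and six underscore-separated fields the question's regexes require:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, inlined instead of being read from a file.
my @lines = (
    "AB_a_b_c_d_e_f_g_W1.txt\n",
    "CD_p_q_r_s_t_u_W2.txt\n",
);

my @array;
for my $line (@lines) {
    if ( $line =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
        # One interpolated string becomes one array element.
        push @array, "$1 $2 $3 $4 $5 $6 $7";
    }
    elsif ( $line =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ ) {
        push @array, "$1 $2 $3 $4 $5 $6";
    }
}

print "$_\n" for @array;
```

Each matching line ends up as a single space-joined string, in the order the lines were read.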
Take a look at perldoc -f grep, which returns a list of all elements of a list that match some criterion.
And incidentally, push does take more than one argument: see perldoc -f push.
push @matches, grep { /your regex here/ } @lines;
You didn't include the code leading up to this, though. Some of it is a little odd, such as the use of $_ as a function call. Are you sure you want to do that?
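One thing worth keeping in mind with the grep approach: grep returns the whole matching lines, not the captured groups. A rough sketch of the difference, using map when the captures are wanted instead. The sample lines and the simplified pattern here are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical input lines; the middle one should be ignored.
my @lines = ( "AB_a_b_W1.txt", "skip me", "CD_p_q_W2.txt" );

# push accepts a list, so every line grep returns is pushed in one call.
my @whole;
push @whole, grep { /^(?:AB|CD)_.+_W.+txt/ } @lines;

# map returns the captured text for matching lines and nothing otherwise.
my @captures;
push @captures, map { /^(?:AB|CD)_(.+)_W.+txt/ ? $1 : () } @lines;

print "@whole\n";
print "@captures\n";
```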
If you are using Perl 5.10.1 or later, this is how I would write it.
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1; # or use 5.010;
use autodie;
my @lines = do{
# don't need to check for errors, because of autodie
open( my $file, '<', '/home/user/name' );
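# chomp the whole list at once; grep {chomp} would drop a last line with no trailing newline
chomp( my @input = <$file> );
@input;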
# $file is automatically closed
};
# use 3 arg form of open
open( my $file, '>>', '/home/user/new' );
my @matches;
for( @lines ){
if( /(?:AB|CD)( (?:_[^_]+)+ )_W .+ txt/x ){
my @match = "$1" =~ /_([^_]+)/g;
say {$file} "@match";
push @matches, \@match;
# or
# push @matches, [ "$1" =~ /_([^_]+)/g ];
# if you don't need to print it in this loop.
}
}
close $file;
This is a little more permissive of inputs, but the regex should be a little more "correct" than the original.
Remember that a capturing match in list context returns the captured fields, if any:
#!/usr/bin/perl
use strict; use warnings;
my $file = '/home/user/name';
open my $in, '<', $file
or die "Cannot open '$file': $!";
my @matched;
while ( <$in> ) {
my @fields;
if (@fields = /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/
or @fields = /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/)
{
push @matched, "@fields";
}
}
use Data::Dumper;
print Dumper \@matched;
Of course, you could also do
push @matched, \@fields;
depending on what you intend to do with the matches.
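As a sketch of what the reference variant gives you: each element of @matched is then an array reference, so individual fields stay addressable instead of being flattened into one string. The two input lines below are hypothetical and are matched with the same regexes as above:

```perl
use strict;
use warnings;

my @matched;
for my $line ( "AB_a_b_c_d_e_f_g_W1.txt", "CD_p_q_r_s_t_u_W2.txt" ) {
    # Declared inside the loop, so every iteration gets a fresh array
    # and each pushed reference points at its own data.
    my @fields;
    if (   @fields = $line =~ /AB_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/
        or @fields = $line =~ /CD_(.+)_(.+)_(.+)_(.+)_(.+)_(.+)_W.+txt/ )
    {
        push @matched, \@fields;
    }
}

print "third field of first match: $matched[0][2]\n";
print "second match has ", scalar @{ $matched[1] }, " fields\n";
print "second match joined: @{ $matched[1] }\n";
```

Storing "@fields" is simpler if you only need the joined text; storing \@fields is the better fit if you need to get at individual fields later.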
I wonder if using push and giant regexes is really the right way to go.
The OP says he wants lines starting with AB at index 0, and those with CD at index 1. Also, those regexes look like an inside-out split to me.
In the code below I have added some didactic comments that point out why I am doing things differently than the OP and the other solutions offered here.
#!/usr/bin/perl
use strict;
use warnings; # best use warnings too. strict doesn't catch everything
my $filename = "/home/user/name";
# Using 3 argument open addresses some security issues with 2 arg open.
# Lexical filehandles are better than global filehandles, they prevent
# most accidental filehandle name collisions, among other advantages.
# Low precedence or operator helps prevent incorrect binding of die
# with open's args
# Expanded error message is more helpful
open( my $inh, '<', $filename )
or die "Error opening input file '$filename': $!";
my @file_data;
# Process file with a while loop.
# This is VERY important when dealing with large files.
# for will read the whole file into RAM.
# for/foreach is fine for small files.
while( my $line = <$inh> ) {
chomp $line;
# Simple regex captures the data type indicator and the data.
if( $line =~ /(AB|CD)_(.*)_W.+txt/ ) {
# Based on the type indicator we set variables
# used for validation and data access.
my( $index, $required_fields ) = $1 eq 'AB' ? ( 0, 7 )
: $1 eq 'CD' ? ( 1, 6 )
: ();
next unless defined $index;
# Why use a complex regex when a simple split will do the same job?
my @matches = split /_/, $2;
# Here we validate the field count, since split won't check that for us.
unless( @matches == $required_fields ) {
warn "Incorrect field count found in line '$line'\n";
next;
}
# Warn if we have already seen a line with the same data type.
if( defined $file_data[$index] ) {
warn "Overwriting data at index $index: '@{$file_data[$index]}'\n";
}
# Store the data at the appropriate index.
$file_data[$index] = \@matches;
}
else {
warn "Found non-conformant line: $line\n";
}
}
Be forewarned, I just typed this into the browser window. So, while the code should be correct, there may be typos or missed semicolons lurking--it's untested, use it at your own peril.