How to return only lines that do not match any values of an array?

开发者 https://www.devze.com 2023-02-15 15:49 Source: Internet
Using Perl, I'm attempting to compare each line in a CSV file against every element (a string) stored in an array. I want to print a line from the CSV file to an output file only if it is not matched by any of the strings in the array. I've tried numerous kinds of loops, but I have not found a solution, and none of my attempts gives me any clue as to where I'm going wrong. Below are two of the loops I've tried:

while (<CSVFILE>) {
   foreach $i (@lines) {
        print OUTPUTFILE $_ if $_ !~ m/$i/;
     }; #foreach
}; #while

AND:

foreach $i (@lines) {
open (CSVFILE , "< $csv") or die "Can't open $csv for read: $!";
  while (<CSVFILE>) {
    if ($_ !~ m/$i/) {
      print OUTPUTFILE $_;
    }; #if
  }; #while
close (CSVFILE) or die "Cannot close $csv: $!";
}; #foreach

Here is a sample of the CSV file I am attempting to process:

1,c.03_05delAAG,null,71...
2,c.12T>G,null,24T->G,5...
3,c.87C>T,null,96C->T,82....

And the array elements (with regex escape characters):

c\.12T\>G
c\.97A\>C

Assuming only the above as input data, I would hope to get back:

1,c.03_05delAAG,null,71...
3,c.87C>T,null,96C->T,82....

since they do not contain any of the elements from the array. Is this a situation where hashes come into play? I don't have a great handle on them yet, aside from the standard "dictionary" definition. If anyone could help me get my head around this problem, it would be greatly appreciated. At this point I might just do it manually, as there aren't that many lines and I need this out of the way ASAP, but since I wasn't able to find any answers by searching elsewhere, I figured it was worthwhile asking.


Use Perl 5.10.1 or later so that you can apply smart matching. Also, don't use the implicit $_ when you're dealing with two nested loops; it gets confusing and is error-prone.

The following code (untested) might do the trick:

use 5.010;
use strict;
use warnings;
use autodie;

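# ... (define @lines, $csvfile and $outputfile here)

# Compile each pattern string into a regex object once, up front.
my @regexes = map { qr{$_} } @lines;

open my $out, '>', $outputfile;
open my $csv, '<', $csvfile;

# Print a line only if it matches none of the compiled regexes.
while (my $line = <$csv>) {
    print $out $line unless $line ~~ @regexes;
}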

close $csv;
close $out;
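
Smart matching was marked experimental in Perl 5.18 and deprecated in 5.38, so on newer interpreters the ~~ line may emit warnings. As a hedged alternative, here is a minimal sketch that joins the patterns into a single alternation and filters with grep. The helper name lines_matching_none and the in-memory sample data are assumptions made for illustration; they are not part of the original code, and the patterns are assumed to be regex-escaped already, as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines that match none of the patterns.
sub lines_matching_none {
    my ($patterns, $lines) = @_;

    # With no patterns, nothing can match, so every line is kept.
    return @$lines unless @$patterns;

    # Join the pattern strings into one alternation and compile it once.
    # The patterns are used as regexes as-is, so they must already be escaped.
    my $alternation = join '|', map { "(?:$_)" } @$patterns;
    my $any_pattern = qr/$alternation/;

    return grep { $_ !~ $any_pattern } @$lines;
}

# Sample data copied from the question.
my @patterns  = ('c\.12T\>G', 'c\.97A\>C');
my @csv_lines = (
    "1,c.03_05delAAG,null,71...\n",
    "2,c.12T>G,null,24T->G,5...\n",
    "3,c.87C>T,null,96C->T,82....\n",
);

my @kept = lines_matching_none(\@patterns, \@csv_lines);
print @kept;
```

With the sample data, only the lines starting with 1 and 3 are kept. To use this on a real file, read the lines from the CSV filehandle instead of the in-memory list.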

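The reason your code doesn't work, by the way, is that it prints a line once for every element of @lines that doesn't match it. Unless a line matches every element, at least one element fails to match, so nearly every line gets printed, often more than once.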
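
To make the difference concrete, the sketch below runs both kinds of logic over the sample data from the question: the original "print whenever one pattern fails" test, and a corrected version that checks all patterns first and decides once per line. The variable names and the in-memory data are assumptions for illustration only.

```perl
use strict;
use warnings;

# Patterns and lines copied from the question.
my @patterns  = ('c\.12T\>G', 'c\.97A\>C');
my @csv_lines = (
    "1,c.03_05delAAG,null,71...\n",
    "2,c.12T>G,null,24T->G,5...\n",
    "3,c.87C>T,null,96C->T,82....\n",
);

my @original_output;    # what the question's first loop would print
my @fixed_output;       # lines that match no pattern at all

for my $line (@csv_lines) {
    # Original logic: one print per pattern that fails to match.
    for my $pattern (@patterns) {
        push @original_output, $line if $line !~ m/$pattern/;
    }

    # Fixed logic: look for any match first, then decide once per line.
    my $matched = 0;
    for my $pattern (@patterns) {
        if ($line =~ m/$pattern/) {
            $matched = 1;
            last;    # one match is enough to reject the line
        }
    }
    push @fixed_output, $line unless $matched;
}

print "original loop printed ", scalar(@original_output), " lines\n";
print "fixed loop printed ", scalar(@fixed_output), " lines\n";
```

On this data the original logic prints 5 lines (lines 1 and 3 twice each, line 2 once), while the fixed logic prints only lines 1 and 3.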
