Perl regex find and replace

开发者 https://www.devze.com 2023-03-25 12:33 Source: Internet

I'm new to Perl and I'm trying to figure out a find and replace. I have a large CSV file (actually semicolon-separated). Some of the numbers (integers and decimals) in the file have the negative sign after the number. I need to move the negative sign to before the number.

E.g: Change

ABC;10.00-;XYZ

to

ABC;-10.00;XYZ

I'm not sure how to do this in Perl. Can someone please help?

Regards, Anand


I would not dabble around in a large csv file with regexes, unless I was very sure about my data and the regex. Using a CSV module seems to me to be the safest way.

This script will take input files as arguments, and write the corrected files with a .new extension.

If you notice undesired changes in your output file, you can try to un-comment the keep_meta_info line.

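use strict;
use warnings;
use autodie;    # open and close die with a message on failure
use Text::CSV;

my $out_ext = ".new";
my $csv = Text::CSV->new( { 
        sep_char => ";",
        #   keep_meta_info => 1,
        binary => 1,
        eol => $/,
    } ) or die "" . Text::CSV->error_diag();

for my $arg (@ARGV) {
    open my $input, '<', $arg;
    open my $output, '>', $arg . $out_ext;
    while (my $row = $csv->getline($input)) {
        for (@$row) {
            # Move a trailing minus to the front: "10.00-" becomes "-10.00".
            # The pattern is anchored only at the end of the field.
            s/([0-9\.]+)\-$/-$1/;
        }
        $csv->print($output, $row);
    }
    # Report a parse error if getline stopped before the end of the file
    $csv->eof or $csv->error_diag();
    close $input;
    close $output;
}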


I'll assume you don't have to worry about quoting or escaping in your delimited file. This reads from standard input and writes to standard output; change to the appropriate files if required.

while( my $line = <STDIN> )
{
    chomp( $line );                      # remove only the trailing newline
    my @rec = split( /;/, $line, -1 );   # -1 keeps trailing empty fields
    s/^(\d*\.?\d+)\-$/-$1/ for @rec;     # "10.00-" becomes "-10.00"
    print join(';',@rec) . "\n";
}

If you do have to worry about escaping and quoting, then use Text::CSV_XS instead of the <STDIN>, split, and join operations.
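
To try the split/join approach on sample lines without piping a file through it, the same steps can be wrapped in a small sub. This is only a minimal sketch: the name fix_line is made up for illustration, and it still assumes the file contains no quoted or escaped semicolons. The -1 limit passed to split keeps trailing empty fields, which split would otherwise drop.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this sketch): takes one line
# without its newline and returns it with the trailing minus moved to
# the front of every purely numeric field.
sub fix_line {
    my ($line) = @_;
    my @rec = split /;/, $line, -1;    # -1 keeps trailing empty fields
    for my $field (@rec) {
        $field =~ s/^(\d*\.?\d+)-$/-$1/;
    }
    return join ';', @rec;
}

print fix_line('ABC;10.00-;XYZ'), "\n";    # prints ABC;-10.00;XYZ
print fix_line('A-1-;5-;;'), "\n";         # prints A-1-;-5;;
```

Because the pattern is anchored with ^ and $, a field such as A-1- is left alone, since it is not a number followed by a minus.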


In general, the replace command is s/old/new/flags:

s/(           # start a capture group
    \d+       # first part of the number
    (\.\d+)?  # possibly a decimal dot and the fractional part
  )-          # end capture group, match the minus sign
 /-$1/gx      # move minus to the front

The g flag means “global” (replace all occurrences), and x is “extended legibility” (allows whitespace and comments in the pattern). You have to test the expression on your data to see what corner cases you might have missed; it usually takes a few iterations to get the right one. Samples:

$ echo "10.5-;10-;0-;a-" | perl -pe 's/(\d+(\.\d+)?)-/-$1/g'
-10.5;-10;-0;a-

See also perldoc perlop (search for “replacement” to jump to the right section).
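
One corner case worth testing: the unanchored pattern also matches digits at the end of a longer field, so x1- would become x-1. A possible tightening, sketched here as an assumption rather than as part of the answer above, is to require a field boundary on both sides using lookarounds. The sub name move_minus is invented for this example, and the input is assumed to be a single line with its newline already removed.

```perl
use strict;
use warnings;

# Hypothetical helper: moves a trailing minus to the front, but only when
# the number makes up a whole semicolon-separated field.
sub move_minus {
    my ($line) = @_;
    $line =~ s/
        (?<![^;])            # preceded by a semicolon or the start of the line
        ( \d+ (?:\.\d+)? )   # the number: digits and an optional fractional part
        -                    # the trailing minus sign
        (?= ; | \z )         # followed by a semicolon or the end of the line
    /-$1/gx;
    return $line;
}

print move_minus('10.5-;10-;0-;a-;x1-'), "\n";    # prints -10.5;-10;-0;a-;x1-
```

Like the original pattern, this sketch requires at least one digit before the decimal point, so a field such as .5- is left unchanged.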
