Perl: replace pattern from the current position until the end of a line

Developer https://www.devze.com 2022-12-12 01:01 Source: web
In Perl, how can I replace a pattern from the current position (the position of the last replacement) until the end of a line?

I have done all of these replacements in a single line:

...
s/\[//;
s/(\/\w\w\w\/)/ getMonth $1 /e;
s/:/ /;
s/\s\+\d\d\d\d\]//;
#NOW: replace all blanks with a plus sign from this position until the end of this line.


I see you have accepted an answer. However, for the task at hand, it would have been more appropriate to use Apache::ParseLog or maybe Apache::LogRegex:

Apache::LogRegex - Parse a line from an Apache logfile into a hash

It looks to me like you are trying to write a log file analyzer from scratch and this is your way of grouping log file entries by month. If that is the case, please stop re-inventing square wheels.

Even if you do not want to use external modules, you can simplify the task by dividing and conquering using split:

#!/usr/bin/perl

use strict; use warnings;
use Carp;
use Regex::PreSuf;

my @months = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %months = map { $months[$_] => sprintf '%02d', $_ + 1 } 0 .. 11;
my $months_re = presuf( @months );

# wrapped for formatting, does not make any difference
my $str = q{62.174.188.166 - - [01/Mar/2003:00:00:00 +0100] "GET
/puntos/img/ganar.gif HTTP/1.1" 200 1551
"http://www.universia.com/puntos/index.jsp";
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; Hotbar 2.0)"};

chomp($str);

my @parts = split qr{\s\[|\]\s}, $str;

if ( $parts[1] =~ m! / ($months_re) / !ix ) {
    $parts[1] = $1;
}

$parts[2] =~ s/\s/+/g;

print join(' ', @parts), "\n";

Output:

62.174.188.166 - - Mar "GET+/puntos/img/ganar.gif+HTTP/1.1"+200+1551+"http://www.universia.com/puntos/index.jsp";+"Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt;+Hotbar+2.0)"
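
For readers who want to avoid the Regex::PreSuf dependency, the same split idea can be sketched with core Perl only. The log line below is a shortened, made-up sample in the same shape as the one above, and the month is matched with a plain three-letter character class rather than a list of month names, so treat it as an illustration rather than a drop-in replacement:

```perl
use strict;
use warnings;

# Shortened, hypothetical log line in the same shape as the sample above.
my $str = q{62.174.188.166 - - [01/Mar/2003:00:00:00 +0100] "GET /a.gif HTTP/1.1" 200 1551};

# Split into three parts: client fields, timestamp, everything after it.
my @parts = split /\s\[|\]\s/, $str;

# Keep only the three-letter month from the timestamp.
$parts[1] = $1 if $parts[1] =~ m{/([A-Za-z]{3})/};

# Blanks become plus signs only in the part after the timestamp.
$parts[2] =~ tr/ /+/;

my $result = join ' ', @parts;
print "$result\n";
```

Because the substitution runs on $parts[2] alone, there is no need to track a "current position" in the original string.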


From your wording, you seem to imagine that your sequence of substitutions works forward through the string, each substitution picking up where the last one left off. In fact, each substitution is applied to the entire string, starting from the beginning.
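
A small sketch makes this visible. The sample string is invented for the illustration; the point is only that the second s/// does not continue from where the first one stopped:

```perl
use strict;
use warnings;

# Hypothetical sample string, not taken from the question.
my $line = 'a b [c d] e f';

# Each s/// scans from the start of the string, not from where
# the previous substitution stopped.
$line =~ s/\[//;    # removes the bracket: 'a b c d] e f'
$line =~ s/ /+/;    # replaces the FIRST blank in the whole string

print "$line\n";    # prints: a+b c d] e f
```

The plus sign lands before the position of the removed bracket, which is exactly what the question was hoping to avoid.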

When you say "the position of the last replacement", what should happen if the previous substitution found nothing?

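In a script, you can rebuild the line from the pieces around the match:

if ( s/\s\+\d\d\d\d\]// ) { $_ = $` . $' =~ tr/ /+/r }

($' itself is read-only, so it cannot be the target of a substitution; tr///r needs Perl 5.14 or later.) Use of $` and $' should be avoided in reusable code, since it can impact performance of other regular expressions. There, you'd need to do

if ( s/\s\+\d\d\d\d\]// ) { substr($_, $-[0]) =~ s/ /+/g }

using $-[0] rather than $+[0], because the offsets in @- and @+ refer to the string as it was before the substitution, and the removed text began at $-[0]. In either case, you need to make sure that the match or substitution you expect to have set $' or @- actually succeeded.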
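
A guarded version of the substr approach might look like the sketch below. The helper name plus_after_tz and the sample lines are invented for this illustration. It uses $-[0] (the start of the match) rather than $+[0], on the assumption that the replacement is empty, so the text after the removed chunk begins exactly where the match began:

```perl
use strict;
use warnings;

# plus_after_tz is a hypothetical helper name used only for this sketch.
# It removes the " +NNNN]" time-zone chunk and turns every blank after
# that point into a plus sign. Lines without the chunk are returned as-is.
sub plus_after_tz {
    my ($line) = @_;
    if ( $line =~ s/\s\+\d\d\d\d\]// ) {
        # @- still describes the string before the substitution; because
        # the replacement is empty, the tail now starts at $-[0].
        substr( $line, $-[0] ) =~ tr/ /+/;
    }
    return $line;
}

print plus_after_tz('01 Mar 2003 00:00:00 +0100] "GET /x HTTP/1.1" 200'), "\n";
print plus_after_tz('no time zone here'), "\n";
```

The second call shows the failure case: when the substitution finds nothing, the line is left untouched instead of being cut at a stale offset from some earlier match.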


Since Perl 5.6, the position at the end of the last match is stored in the @+ array. The position at the end of the entire match is $+[0].

You can use this to split the string into two parts and do a replacement on only the latter part:

my $base = " pears apples bananas coconuts ";
$base =~ s/apples/oranges/;
my $firstpart = substr($base, 0, $+[0]);
my $secondpart = substr($base, $+[0]); 
$secondpart =~ s/ /\+/g;
print '"' . $firstpart . $secondpart . "\"\n";

Which will print:

" pears oranges+bananas+coconuts+"

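One problem with this approach is that $+[0] holds the offset in the string as it was before the replacement. In this example "oranges" is one character longer than "apples", so the split falls one character early; the output is still correct only because no blank lies in between. So perhaps there is a better way :)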
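
One possible way around that, sketched under the assumption that the replacement text is available in a variable (here the made-up name $replacement), is to compute the cut point from the start of the match plus the length of the replacement:

```perl
use strict;
use warnings;

my $base        = " pears apples bananas coconuts ";
my $replacement = "oranges";    # kept in a variable so its length is known

if ( $base =~ s/apples/$replacement/ ) {
    # $-[0] is where the match began; the text before it is unchanged,
    # so the replacement ends at $-[0] + length($replacement).
    my $cut = $-[0] + length $replacement;
    substr( $base, $cut ) =~ tr/ /+/;
}

print qq{"$base"\n};    # prints: " pears oranges+bananas+coconuts+"
```

This keeps the cut point correct even when the replacement is longer or shorter than the matched text, and the if guard skips the second step when nothing matched.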
