Need to print the last occurrence of a string in Perl

开发者 https://www.devze.com 2023-01-11 09:08 Source: web
I have a script in Perl that searches for an error listed in a config file, but it prints out every occurrence of the error. I need to match what is in the config file and print out only the last time the error occurred. Any ideas?


Wow... I was not expecting this much of a response. I should've been clearer in stating that this is for log monitoring on a Windows box that sends an alert to Nagios. This is actually my first Perl program, and all this information has been very helpful. Does anyone know how I can apply any of the tail answers on a Wintel box?


Another way to do it:

perl -n -e '$e = $1 if /(REGEX_HERE)/;  END{ print $e }' CONFIG_FILE_HERE
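
The same idea can be written as a small script, which avoids shell quoting altogether (cmd.exe on Windows does not treat single quotes as quoting characters, so the one-liner above would need double quotes there). This is only a sketch: the sub name last_match, the sample log text, and the pattern are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last capture of $re found in the
# lines read from $fh, or undef when nothing matched.
sub last_match {
    my ($fh, $re) = @_;
    my $last;
    while (my $line = <$fh>) {
        # Each new match overwrites the previous one.
        $last = $1 if $line =~ $re;
    }
    return $last;
}

# Demo on an in-memory file; the log text and pattern are illustrative only.
my $log = "ok\nERROR disk full\nok\nERROR timeout\n";
open my $fh, '<', \$log or die "open: $!";
my $found = last_match($fh, qr/(ERROR.*)/);
print "$found\n" if defined $found;
```

To use it on a real file, open that file instead of the in-memory string and pass your own pattern with one capture group.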


What exactly do you need to print? The line containing the error? More context than that? File::ReadBackwards can be helpful.
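
If the whole matching line is enough, a sketch with File::ReadBackwards might look like the following. The module comes from CPAN and is not part of core Perl; the sub name last_matching_line and the sample data are hypothetical. Reading from the end means the first hit is the last occurrence, so a large log does not have to be scanned in full.

```perl
use strict;
use warnings;
use File::ReadBackwards;      # CPAN module, must be installed separately
use File::Temp qw(tempfile);  # core module, used only for the demo file

# Hypothetical helper: read $path from the end and return the first line
# that matches $re, which is the last match in file order.
sub last_matching_line {
    my ($path, $re) = @_;
    my $bw = File::ReadBackwards->new($path)
        or die "can't read $path: $!";
    while (defined(my $line = $bw->readline)) {
        return $line if $line =~ $re;
    }
    return;    # nothing matched
}

# Demo with a throwaway file; the contents are illustrative only.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "ok\nERROR first\nok\nERROR second\nok\n";
close $tmp or die "close: $!";

my $hit = last_matching_line($path, qr/ERROR/);
print $hit if defined $hit;
```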


In outline:

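my $errinfo;
while (<>)
{
    # Overwrite on every match, so only the most recent one is kept.
    # Here the whole line is saved; store whatever detail you need.
    $errinfo = $_ if (m/the error pattern/);
}
print "error: $errinfo" if (defined $errinfo);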

This catches all errors, but doesn't print until the end, when only the last one survives.


A brute-force approach involves setting up your own pipeline by pointing STDOUT to tail. This allows you to print all errors, and then it's up to tail to worry about only letting the last one out.

You didn't specify, so I assume a legal config line is of the form

Name = some value

Matching that is straightforward:

  • ^ (starting at the beginning of line)
  • \w+ (one or more “word characters”)
  • \s+ (followed by mandatory whitespace)
  • = (followed by an equals sign)
  • \s+ (more mandatory whitespace)
  • .+ (some mandatory value)
  • $ (finishing at the end of the line)

Gluing it together, we get

#! /usr/bin/perl

use warnings;
use strict;

# for demo only
*ARGV = *DATA;

my $pid = open STDOUT, "|-", "tail", "-n", "1" or die "$0: open: $!";
while (<>) {
  print unless /^ \w+ \s+ = \s+ .+ $/x;
}

close STDOUT or warn "$0: close: $!";

__DATA__
This = assignment is ok
But := not this
And == definitely not this

Output:

$ ./lasterr 
And == definitely not this
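
Where tail is not available, as on a stock Windows machine, the same filter can keep the last offending line in a variable instead of piping through tail. This is a minimal sketch that reuses the pattern above; the sub name last_bad_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last line that is NOT a legal
# "Name = some value" config line, or undef if every line is legal.
sub last_bad_line {
    my ($fh) = @_;
    my $bad;
    while (my $line = <$fh>) {
        # Same pattern as above; a later bad line replaces an earlier one.
        $bad = $line unless $line =~ /^ \w+ \s+ = \s+ .+ $/x;
    }
    return $bad;
}

# Same sample data as the demo above, read from an in-memory file.
my $config = <<'END';
This = assignment is ok
But := not this
And == definitely not this
END
open my $fh, '<', \$config or die "open: $!";
my $bad = last_bad_line($fh);
print $bad if defined $bad;
```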

With regular expressions, when you want the last occurrence of a pattern, place ^.* at the front of your pattern. For example, to replace the last X in the input with Y, use

$ echo XABCXXXQQQXX | perl -pe 's/^(.*)X/$1Y/'
XABCXXXQQQXY

Note that the ^ is redundant here: the regex engine tries the leftmost starting position first, and the greedy .* then runs as far as it can before backtracking to the last X. I like having it there for emphasis.
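
The greedy-prefix trick can be wrapped in a small function. This is a sketch; the name replace_last is made up, and it treats the search string as literal text rather than as a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the last occurrence of the literal
# string $from in $text with $to.
sub replace_last {
    my ($text, $from, $to) = @_;
    # Greedy (.*) grabs as much as it can, then backtracks only as far
    # as needed to find $from, so the match lands on the last occurrence.
    # /s lets . cross newlines; \Q...\E escapes regex metacharacters.
    $text =~ s/\A(.*)\Q$from\E/$1$to/s;
    return $text;
}

print replace_last('XABCXXXQQQXX', 'X', 'Y'), "\n";   # XABCXXXQQQXY
```

If the search string does not occur at all, the substitution fails and the text comes back unchanged.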

Applying this technique to your problem, you can search for the last line in your config file that contains an error as in the following program:

#! /usr/bin/perl

use warnings;
use strict;

local $_ = do { local $/; scalar <DATA> };
if (/\A.* ^(?! \w+ \s+ = \s+ [^\r\n]+ $) (.+?)$/smx) {
  print $1, "\n";
}

__DATA__
This = assignment is ok
But := not this
And == definitely not this

The syntax of the regular expression is a bit different because $_ contains multiple lines, but the principle is the same. \A is similar to ^, but it matches only at the very beginning of the string being searched, regardless of switches. With the /m switch (“multi-line”), ^ matches at every logical line boundary.
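
A quick way to see the difference between \A and ^ under /m is to collect the first word at each place the anchor matches. This sketch reuses the sample data; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "This = assignment is ok\nBut := not this\nAnd == definitely not this\n";

# With /m, ^ matches at the start of every logical line.
my @line_starts = $text =~ /^(\w+)/mg;

# \A ignores /m and matches only at the very start of the string.
my @string_start = $text =~ /\A(\w+)/mg;

print "^  with /m: @line_starts\n";    # This But And
print "\\A with /m: @string_start\n";  # This
```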

Up to this point, we know the pattern

/\A.* ^ .../

matches at the start of the last line where the rest of the pattern can succeed: the greedy .* first runs to the end of the input, then backtracks until ^ matches and everything after it does too. The negative look-ahead assertion (?!...) succeeds only on a line that is not a legal config line. Ordinarily . matches any character except newline, but the /s switch (“single line”) lifts this restriction. Specifying [^\r\n]+, that is, one or more characters that are neither carriage return nor line feed, keeps the look-ahead from spilling into the next line.

Look-around assertions do not capture, so we grab the offending line with (.+?)$. It is safe to use . here because we already know the current line is bad, and the non-greedy quantifier +? stops matching as soon as it can, which in this case is at the end of the current logical line.

All these regular expressions use the /x switch (“extended mode”) to allow extra whitespace: the aim is to improve readability.
