Perl: Parse maillog to get date/recipient in a single regex statement

Developer https://www.devze.com 2023-03-12 13:15 Source: Internet
I'm trying to parse my maillog, which contains a number of lines which look similar to the following line:

Jun  6 17:52:06 host sendmail[30794]: p569q3sX030792: to=<person@recipient.com>, ctladdr=<apache@host.com> (48/48), delay=00:00:03, xdelay=00:00:03, mailer=esmtp, pri=121354, relay=gmail-smtp-in.l.google.com. [1.2.3.4], dsn=2.0.0, stat=Sent (OK 1307354043 x8si28599066ict.63)

The rules I'm trying to apply are:

  • The date is always the first 2 words
  • The email address always appears in the form " to=person@recipient.com, ", although the address might be surrounded by <>

There are some lines in the log which do not relate to a recipient, so I'd like to ignore those lines entirely.

The following code works for either rule individually, however I'm having trouble combining them:

if($_ =~ m/\ to=([<>a-zA-Z0-9\.\@]*),\ /g) {
  print "$1\n";
}

if($_ =~ /^+(\S+\s+\S+\s)/g) {
  print "$1\n";
}

As always, I'm not sure whether the regex I'm using above is "best practice" so feel free to point out anything I'm doing badly there too :)

Thanks!


print substr($_, 0, 7), "$1\n" if / to=(.+?), /;

Your date is in a fixed-length format, so you don't need a regular expression to match it.
For the address, you need the part between to= and the next comma, so a non-greedy match does the job.
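
The fixed-width approach above can be sketched as a small helper that also skips lines without a recipient and strips the optional angle brackets. This is only an illustration: the name parse_fixed is hypothetical, the sample line is the one from the question, and the sketch assumes a single recipient per to= field and a date that always fills the first 6 characters.

```perl
use strict;
use warnings;

# Sample line from the question (single quotes, so @ is not interpolated).
my $sample = 'Jun  6 17:52:06 host sendmail[30794]: p569q3sX030792: '
           . 'to=<person@recipient.com>, ctladdr=<apache@host.com> (48/48), '
           . 'delay=00:00:03, xdelay=00:00:03, mailer=esmtp, stat=Sent';

# parse_fixed is a hypothetical helper name, not part of the answer above.
sub parse_fixed {
    my ($line) = @_;

    # Lines without a " to=..., " field do not relate to a recipient: skip them.
    return unless $line =~ / to=(.+?), /;
    my $addr = $1;

    # Assumption: the date ("Mon DD") always occupies the first 6 characters.
    my $date = substr($line, 0, 6);

    # Drop the surrounding angle brackets if they are present.
    $addr =~ s/^<|>$//g;

    return ($date, $addr);
}

my ($date, $addr) = parse_fixed($sample);
print "$date $addr\n";    # prints "Jun  6 person@recipient.com"
```

Because the helper returns an empty list for lines that have no recipient, a caller can test the result in list context and simply move on to the next line.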


To match either one with a single regex, combine the two patterns with alternation, using the syntax (regex1|regex2):

((?<=\ to=)[<>a-zA-Z0-9\.\@]*(?=,\ )|^\S+\s+\S+\s)

The outer parentheses ensure that whichever alternative matches is assigned to $1.

The lookbehind (?<=\ to=) and lookahead (?=,\ ) do not capture anything, so the regex captures only your target string.
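
As a rough sketch of how an alternation like this behaves in practice: each successful match returns only one of the two values, so the pattern has to be applied with /g in list context to collect both from the same line. The second helper shows an alternative that is not from the answer above: a single pattern with two capture groups that returns the date and the address together. Both helper names (alternation_parts, parse_line) are hypothetical, and the second pattern assumes one recipient per to= field.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the alternation with /g in list context.
# Returns every $1, e.g. ("Jun  6 ", "<person@recipient.com>").
sub alternation_parts {
    my ($line) = @_;
    return $line =~ /((?<=\ to=)[<>a-zA-Z0-9\.\@]*(?=,\ )|^\S+\s+\S+\s)/g;
}

# Hypothetical helper: one regex, two captures.
#   $1 = the first two whitespace-separated words (the date)
#   $2 = the address after " to=", without the optional <>
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^(\S+\s+\S+)\s.*?\sto=<?([^<>,\s]+)>?,\s/) {
        return ($1, $2);
    }
    return;    # no recipient on this line
}

# Typical use: read the log from STDIN or from files named on the command line.
while (my $line = <>) {
    my ($date, $addr) = parse_line($line) or next;
    print "$date $addr\n";
}
```

The two-capture form keeps the date and the address paired per line, which avoids having to work out afterwards which alternative produced each match.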
