I am looking for a 'grep' or 'pcregrep -M' like solution that parses a log file with the following properties:
- Each log entry can be multiple lines in length
- First line of log entry has the key that I want to search for
- Each key appears on more than one line
So in the example below I would want to return every line that has KEY1 on it and all the supporting lines below it until the next log message.
Log file:
01 Feb 2010 - 10:39:01.755, DEBUG - KEY1:randomtext
blah
blah2 T
blah3 T
blah4 F
blah5 F
blah6
blah7
01 Feb 2010 - 10:39:01.757, DEBUG - KEY1:somethngelse
01 Feb 2010 - 10:39:01.758, DEBUG - KEY2:randomtest
this is a test
01 Feb 2010 - 10:39:01.760, DEBUG - KEY1:more logs here
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:eve more here
this is another multiline log entry
keeps on going
but not as long as before
01 Feb 2010 - 10:39:01.763, DEBUG - KEY2:testing
test test test
end of key2
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:but key 1 is still going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
okay enough
01 Feb 2010 - 10:39:01.762, DEBUG - KEY3:and so on
and on

Desired output of searching for KEY1:
01 Feb 2010 - 10:39:01.755, DEBUG - KEY1:randomtext
blah
blah2 T
blah3 T
blah4 F
blah5 F
blah6
blah7
01 Feb 2010 - 10:39:01.757, DEBUG - KEY1:somethngelse
01 Feb 2010 - 10:39:01.760, DEBUG - KEY1:more logs here
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:eve more here
this is another multiline log entry
keeps on going
but not as long as before
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:but key 1 is still going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
okay enough
I was trying to do something like:
pcregrep -M 'KEY1(.*\n)+' logfile
but that definitely doesn't work right.

If you are on *nix, you can use the shell:
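#!/bin/bash
read -r -p "Enter key: " key
awk -v key="$key" '
# a new entry (a DEBUG line) without the key switches printing off
$0 ~ /DEBUG/ && $0 !~ key { f = 0 }
# any line containing the key switches printing on
$0 ~ key { f = 1 }
# print every line while the flag is set
f { print }
' file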
Output:
$ cat file
01 Feb 2010 - 10:39:01.755, DEBUG - KEY1:randomtext
blah
blah2 T
blah3 T
blah4 F
blah5 F
blah6
blah7
01 Feb 2010 - 10:39:01.757, DEBUG - KEY1:somethngelse
01 Feb 2010 - 10:39:01.758, DEBUG - KEY2:randomtest
this is a test
01 Feb 2010 - 10:39:01.760, DEBUG - KEY1:more logs here
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:eve more here
this is another multiline log entry
keeps on going
but not as long as before
01 Feb 2010 - 10:39:01.763, DEBUG - KEY2:testing
test test test
end of key2
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:but key 1 is still going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
okay enough
01 Feb 2010 - 10:39:01.762, DEBUG - KEY3:and so on
and on
$ ./shell.sh
Enter key: KEY1
01 Feb 2010 - 10:39:01.755, DEBUG - KEY1:randomtext
blah
blah2 T
blah3 T
blah4 F
blah5 F
blah6
blah7
01 Feb 2010 - 10:39:01.757, DEBUG - KEY1:somethngelse
01 Feb 2010 - 10:39:01.760, DEBUG - KEY1:more logs here
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:eve more here
this is another multiline log entry
keeps on going
but not as long as before
01 Feb 2010 - 10:39:01.762, DEBUG - KEY1:but key 1 is still going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
and going
okay enough
I had a similar requirement and decided to code a little tool (in .NET) that parses log files for me and writes the result to standard output.
Maybe you'll find it useful. It works on Windows and Linux (Mono).
See here: https://github.com/iohn2000/ParLog
A tool to filter log files for log entries that contain a specific (regex) pattern. Works also with multiline log entries. e.g.: show only log entries from a certain workflow instance. Writes the result to standard output. Use '>' to redirect into a file
The default startPattern is:
^[0-9]{2} [\w]{3} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}
This corresponds to a date format such as: 04 Feb 2017 15:02:50,778
Parameters are:
f:wildcard a file name or wildcard for multiple files
p:pattern the regex pattern to filter the file(s)
s:startPattern regex pattern to define when a new log entry starts
Example :
ParLog.exe -f=*.log -p=findMe
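For readers who only have standard Unix tools, the same idea (a start pattern that delimits entries, plus a filter pattern that may match on any line of an entry) can be approximated in awk. This is only a rough sketch and not how ParLog itself is implemented; the function name filter_log is hypothetical, and the start regex used in the usage note is my own adaptation to the "01 Feb 2010 - 10:39:01.755" format from the question.

```shell
# filter_log START_REGEX PATTERN FILE   (hypothetical helper, not part of ParLog)
# Buffers each log entry and prints it only if some line in it matches PATTERN.
filter_log() {
    awk -v start="$1" -v pat="$2" '
        # print the buffered entry if it matched, then reset the buffer
        function emit() { if (hit) printf "%s", buf; buf = ""; hit = 0 }
        $0 ~ start { emit() }           # a new entry begins: handle the previous one
        { buf = buf $0 "\n" }           # collect the current line
        $0 ~ pat { hit = 1 }            # remember that this entry matched
        END { emit() }                  # handle the last entry
    ' "$3"
}
```

Usage, assuming the log from the question is saved as logfile:
filter_log '^[0-9][0-9] [A-Za-z][A-Za-z][A-Za-z] [0-9][0-9][0-9][0-9] - ' 'this is a test' logfile
Because the whole entry is buffered before it is printed, the pattern may sit on a continuation line rather than on the first line of the entry.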
Adding on to ghostdog74's answer (thank you very much btw, it works great)
Now takes command line input in the form of "./parse file key" and handles loglevels of ERROR as well as DEBUG
#!/bin/bash
awk -v key="$2" '
$0 ~ /DEBUG|ERROR/ && $0 !~ key { f = 0 }
$0 ~ key { f = 1 }
f { print }
' "$1"
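The scripts in this thread recognise the start of an entry by its log level, and they switch printing on for any line that contains the key, including a continuation line of some other entry. A minimal sketch of a variant that instead treats every line beginning with a timestamp as the start of an entry and checks the key only on that line; the function name filter_entries is hypothetical, and the timestamp regex is an assumption based on the sample log in the question:

```shell
# filter_entries KEY FILE   (hypothetical helper name)
# Prints every log entry whose first line matches KEY, plus its continuation lines.
filter_entries() {
    awk -v key="$1" '
        # Assumed format: a line starting like "01 Feb 2010 - " begins a new entry.
        # The flag is decided only here, so the log level no longer matters.
        /^[0-9][0-9] [A-Za-z][A-Za-z][A-Za-z] [0-9][0-9][0-9][0-9] - / { f = ($0 ~ key) }
        f   # print the line while inside a matching entry
    ' "$2"
}
```

Usage: filter_entries KEY1 logfile
As in the original scripts, the key is treated as a regular expression, so a key containing regex metacharacters would need escaping.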