
How to sort by hh:mm:ss.xx in ksh in AIX 5.3?

开发者 (https://www.devze.com), 2023-01-30 17:49, source: web
I have many log files like this:


......

......

cpu time 9.05 seconds

real time 8:02.07

......

......

cpu time 2:25.23

real time 1:39:44.15

......

......


To get all the times, I simply grep for the cpu time and real time lines.

Then I sort the grep output.

I am using AIX 5.2, where sort can order by string or numerically.

But there is no option to sort by hour:minute:second.

To work around this, I pass the grep output lines to a while loop.

In the loop, I create a new variable using sed 's/:/00/g'

This turns hh:mm:ss.xx into hh00mm00ss.xx

and I then sort numerically on this new variable.

This way, I can find the most time-consuming steps.

This workaround does the job, but it is rather slow.

Does anyone have a better alternative?

Thanks in advance.

Alvin SIU


In the paper 'Theory and Practice in the Construction of a Working Sort Routine', J P Linderman shows that the best way to get good performance out of the system sort command (which is the 'sort routine' he was working on) with complex keys was to create commands to generate keys that make the comparisons simple. In the example, the sort command with the complex key was:

sort -t' ' -k 9,9.2 -k3 -k17

The alternative mechanism used a key generator to make it easy to sort:

keygen | sort | keystrip

and the key generator was:

awk -F' ' '{printf "%s:%s:%s:%s\n", substr($9, 1, 2), $3, $17, $0}'

and the key stripper was:

awk -F':' '{printf "%s\n", $4}'

For the test data Linderman was working with, this reduced the elapsed time from around 2100 seconds for the elaborate sort command to about 600 seconds for the awk | sort | awk combination.


Adopting that idea here, I'd use a Perl script to present the disparate time values uniformly in a format that sort can handle trivially.

In this case, you seem to have a variety of time formats to worry about:

cpu time 9.05 seconds
real time 8:02.07
cpu time 2:25.23
real time 1:39:44.15

It is not clear whether you need to preserve the context of the lines you are sorting, but it seems to me that I'd convert the times to a canonical form. Do you need to allow for 3-digit hours of real time? If the time goes to 20.05 seconds, does the suffix remain? If the time goes to 80.05 seconds, is that printed as 1:20.05? I'm assuming yes...

#!/usr/bin/env perl
use strict;
use warnings;

while (<>)
{
    if ($_ =~ m/ (?:cpu|real)\stime\s
                 (?:
                 (?:(\d+):)?      # Hours
                 (\d\d?):         # Minutes
                 )?
                 (\d\d?(?:\.\d+)) # Seconds
               /msx)
    {
        my($hh, $mm, $ss) = ($1, $2, $3);
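        # '//=' needs Perl 5.10 or later; 'defined' also works on older Perls.
        $hh = 0 unless defined $hh;
        $mm = 0 unless defined $mm;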
        $_ = sprintf "%03d:%02d:%05.2f|%s", $hh, $mm, $ss, $_;
    }
    print;
}

Given the input data:

cpu time 9.05 seconds
real time 8:02.07
cpu time 2:25.23
real time 1:39:44.15
cpu time 25.23 seconds
real time 39:44.15
cpu time 5.23 seconds
real time 44.15 seconds
real time 1:44.15
real time 1:04.15
real time 21:04.15
real time 1:01:04.15
real time 32:21:04.15
real time 122:21:04.15

This generates the output data:

000:00:09.05|cpu time 9.05 seconds
000:08:02.07|real time 8:02.07
000:02:25.23|cpu time 2:25.23
001:39:44.15|real time 1:39:44.15
000:00:25.23|cpu time 25.23 seconds
000:39:44.15|real time 39:44.15
000:00:05.23|cpu time 5.23 seconds
000:00:44.15|real time 44.15 seconds
000:01:44.15|real time 1:44.15
000:01:04.15|real time 1:04.15
000:21:04.15|real time 21:04.15
001:01:04.15|real time 1:01:04.15
032:21:04.15|real time 32:21:04.15
122:21:04.15|real time 122:21:04.15

Which can be fed into a simple sort, to yield:

000:00:05.23|cpu time 5.23 seconds
000:00:09.05|cpu time 9.05 seconds
000:00:25.23|cpu time 25.23 seconds
000:00:44.15|real time 44.15 seconds
000:01:04.15|real time 1:04.15
000:01:44.15|real time 1:44.15
000:02:25.23|cpu time 2:25.23
000:08:02.07|real time 8:02.07
000:21:04.15|real time 21:04.15
000:39:44.15|real time 39:44.15
001:01:04.15|real time 1:01:04.15
001:39:44.15|real time 1:39:44.15
032:21:04.15|real time 32:21:04.15
122:21:04.15|real time 122:21:04.15

And from which the sort column can be stripped with 'sed' to yield:

cpu time 5.23 seconds
cpu time 9.05 seconds
cpu time 25.23 seconds
real time 44.15 seconds
real time 1:04.15
real time 1:44.15
cpu time 2:25.23
real time 8:02.07
real time 21:04.15
real time 39:44.15
real time 1:01:04.15
real time 1:39:44.15
real time 32:21:04.15
real time 122:21:04.15

So, given that the data file is 'xx.data' and the Perl script is xx.pl, the command line is:

perl xx.pl xx.data | sort | sed 's/^[^|]*|//'
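
If Perl is not convenient, the same keygen | sort | keystrip pipeline can probably be written with awk alone. The following is only a sketch under some assumptions: the function name sort_by_time is made up for illustration, the time is taken to be the third whitespace-separated field of every "cpu time" or "real time" line, and, unlike the Perl script, lines that do not match are dropped rather than passed through.

```shell
#!/bin/sh
# Hypothetical helper: sort "cpu time"/"real time" lines by duration.
# Assumption: the time value is always the third field of the line.
sort_by_time() {
    awk '
    $2 == "time" && ($1 == "cpu" || $1 == "real") {
        n  = split($3, t, ":")            # 1, 2 or 3 components
        ss = t[n]                         # seconds are always last
        mm = (n >= 2) ? t[n - 1] : 0      # minutes, if present
        hh = (n >= 3) ? t[n - 2] : 0      # hours, if present
        # Fixed-width key, so a plain string sort orders it correctly.
        printf "%03d:%02d:%05.2f|%s\n", hh, mm, ss, $0
    }' "$@" | LC_ALL=C sort | sed 's/^[^|]*|//'
}
```

It would be used as sort_by_time xx.data, or at the end of a pipeline that feeds it the grep output. The key layout matches the Perl version, so hours are assumed to fit in three digits.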


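It would help if you showed your script; however, I suspect that the while loop is unnecessary. Try something like this (the g flag replaces every colon, and -k 3,3 makes the numeric sort use the time field rather than the start of the line):

grep -E '^(cpu|real) time' | sed 's/:/00/g' | sort -n -k 3,3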
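
A variant on the same loop-free idea is to convert each time to plain seconds and sort on that number. This is a rough sketch, not a tested AIX recipe: the name sort_by_seconds is hypothetical, the time is assumed to be the third field, and the lines are assumed to start with "cpu time" or "real time" with no leading whitespace.

```shell
#!/bin/sh
# Hypothetical helper: prefix each matching line with its duration in
# seconds, sort numerically on that prefix, then cut the prefix off.
sort_by_seconds() {
    awk '
    /^(cpu|real) time/ {
        n = split($3, t, ":")
        secs = 0
        # Fold h:m:s (or m:s, or s) into seconds, left to right.
        for (i = 1; i <= n; i++) secs = secs * 60 + t[i]
        printf "%.2f\t%s\n", secs, $0
    }' "$@" | LC_ALL=C sort -n | cut -f2-
}
```

Because the key is a single number, hours of any width sort correctly without padding.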