First, please pardon my rusty Perl. I'm trying to modify Bugzilla's "whine.pl" to generate lists of bugs sorted by severity.
So it gives me an array of hash references. Each hash contains a bunch of information about a particular bug (id, assignee, severity, etc). I want to sort the array by severity. What's the best way to do this?
I'd come up with a couple of possibilities. One is to create five arrays (one for each level of severity), then loop over the array and push the hash refs into the appropriate severity level array. After this, I could reassemble them and replace the original array with the sorted one.
Another way that my friend came up with would be to assign the severity levels (store开发者_JAVA百科d as text in the hashes) to some nubmers, and cmp them. Maybe something like this?
sub getVal {
my $entry = $_[0];
%lookup = ( "critical" => 0, ... );
return $lookup(entry("bug_severity"));
}
@sorted = sort { getVal($a) <=> getVal($b) } @unsorted;
To avoid calling getVal more times than necessary, you can use "decorate, sort, undecorate". Decorate is getting the information you actually care about for the sort:
my @decorated = map { [ $_, getVal($_) ] } @unsorted;
Then sort the decorated list:
my @sortedDecorate = sort { $a->[1] <=> $b->[1] } @decorated;
Then get the original information back (undecorate):
my @sorted = map { $_->[0] } @sortedDecorate;
Or the more Perl-ish way to do it:
@sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, getVal($_) ] } @unsorted;
You can use the Schwartzian Transform:
my @sorted = map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [ $lookup{$_->{bug_severity}, $_ ] }
@unsorted;
Explanation:
map { [ $lookup{$_->{bug_severity}, $_ ] } @unsorted;
maps each bug to an array reference whose first element is the numerical bug severity from the lookup table. Using the Schwartzian Transform, you look up the value only once for each bug in @unsorted
.
Then,
sort { $a->[0] <=> $b->[0] }
sorts that array by the first element. Finally,
@sorted = map { $_->[1] }
pulls out the original bugs from the array returned by sort
.
There is really no need for getval
when all it's doing is a hash lookup.
For automatically generating efficient sorters, CPAN module Sort::Maker is excellent:
use strict; use warnings;
use Sort::Maker;
my @bugs = (
{ name => 'bar', bug_severity => 'severe' },
{ name => 'baz', bug_severity => 'noncritical' },
{ name => 'foo', bug_severity => 'critical' },
);
my $sorter = make_sorter('ST',
name => 'severity_sorter',
init_code => 'my %lookup = (
critical => 0,
severe => 1,
noncritical => -1 );',
number => [ code => '$lookup{$_->{bug_severity}}' ],
);
use Data::Dumper;
print Dumper $_ for severity_sorter( @bugs );
Output:
$VAR1 = { 'name' => 'baz', 'bug_severity' => 'noncritical' }; $VAR1 = { 'name' => 'foo', 'bug_severity' => 'critical' }; $VAR1 = { 'name' => 'bar', 'bug_severity' => 'severe' };
Note that the number of lookups that need to be made when using the naive method depends on the number of elements in @unsorted
. We can count them using the simple program:
#!/usr/bin/perl
use strict;
use warnings;
my ($n_elements) = @ARGV;
my @keys = qw(a b c);
my %lookup = map { $keys[$_-1] => $_ } 1 .. @keys;
my @unsorted = map { $keys[rand 3] } 1 .. $n_elements;
my $n_lookups;
my @sorted = sort {
$n_lookups += 2;
$lookup{$a} <=> $lookup{$b}
} @unsorted;
print "It took $n_lookups lookups to sort $n_elements elements\n";
Output:
C:\Temp> tzt 10 It took 38 lookups to sort 10 elements C:\Temp> tzt 100 It took 978 lookups to sort 100 elements C:\Temp> tzt 1000 It took 10916 lookups to sort 1000 elements C:\Temp> tzt 10000 It took 113000 lookups to sort 10000 elements
Therefore, one would need more information to decide whether the naive sort or using the Schwartzian Transform is the appropriate solution.
And here is a simple benchmark which seems to agree with @Ether's argument:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw( cmpthese );
my ($n_elements) = @ARGV;
my @keys = qw(foo bar baz);
my %lookup = map { $keys[$_] => $_ } 0 .. $#keys;
my @unsorted = map { {v => $keys[rand 3]} } 1 .. $n_elements;
cmpthese(-1, {
naive => sub {
my @sorted = sort {
$lookup{$a->{v}} <=> $lookup{$b->{v}}
} @unsorted;
},
schwartzian => sub {
my @sorted = map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [$lookup{$_->{v}}, $_] }
@unsorted;
}
});
Output:
C:\Temp> tzt 10 Rate schwartzian naive schwartzian 18842/s -- -29% naive 26357/s 40% -- C:\Temp> tzt 100 Rate naive schwartzian naive 1365/s -- -11% schwartzian 1532/s 12% -- C:\Temp> tzt 1000 Rate naive schwartzian naive 121/s -- -11% schwartzian 135/s 12% --
I like your proposed solution:
my %sevs = (critical => 0, high => 1, ...);
my @sorted = sort { $sevs{$a->{bug_severity}} <=> $sevs{$b->{bug_severity}} } @unsorted
You could use a lookup table to determine the ordering of the bugzilla severities, like this (using sample data to illustrate):
use strict; use warnings;
use Data::Dumper;
my @bugInfo = (
{ id => 1,
assignee => 'Bob',
severity => 'HIGH'
},
{ id => 2,
assignee => 'Anna',
severity => 'LOW'
},
{ id => 3,
assignee => 'Carl',
severity => 'EXTREME'
},
);
my %severity_ordering = (
EXTREME => 0,
HIGH => 1,
MEDIUM => 2,
LOW => 3,
);
sub byseverity
{
$severity_ordering{$a->{severity}} <=> $severity_ordering{$b->{severity}}
}
my @sortedBugs = sort byseverity @bugInfo;
print Dumper(\@sortedBugs);
yields:
$VAR1 = [
{
'assignee' => 'Carl',
'id' => 3,
'severity' => 'EXTREME'
},
{
'assignee' => 'Bob',
'id' => 1,
'severity' => 'HIGH'
},
{
'assignee' => 'Anna',
'id' => 2,
'severity' => 'LOW'
}
];
精彩评论