What is the easiest way to g开发者_运维技巧et a key with the highest value from a hash in Perl?
While the solution with sort:
(sort {$hash{$a} <=> $hash{$b}} keys %hash)[0]
found in some of the other answers is quite elegant, it doesn't perform as nicely as it looks. First off, the sort transforms an O(n)
search search operation into an O(n log n)
one. Secondly, the sort solution has n log n
hash look-ups. Hash look-ups are very good for certain operations, but when working with the entire hash, look-ups will be slower than using each
, keys
, or values
to iterate through the data structure. This is because the iterators do not need to calculate the hashes of keys, nor do they need to repeatedly walk through bins to find the values. And the overhead is not constant, but increasing as the hashes get larger.
Here are a few faster solutions:
use strict;
use warnings;
my %hash = (
small => 1,
medium => 5,
largest => 10,
large => 8,
tiny => 0.1,
);
Here is a solution using the each
iterator (an O(1)
operation done n
times):
sub largest_value (\%) {
my $hash = shift;
keys %$hash; # reset the each iterator
my ($large_key, $large_val) = each %$hash;
while (my ($key, $val) = each %$hash) {
if ($val > $large_val) {
$large_val = $val;
$large_key = $key;
}
}
$large_key
}
print largest_value %hash; # prints 'largest'
Or a faster version that trades memory for speed (it makes a copy of the hash):
sub largest_value_mem (\%) {
my $hash = shift;
my ($key, @keys) = keys %$hash;
my ($big, @vals) = values %$hash;
for (0 .. $#keys) {
if ($vals[$_] > $big) {
$big = $vals[$_];
$key = $keys[$_];
}
}
$key
}
print largest_value_mem %hash; # prints 'largest'
Here is the performance with various hash sizes:
10 keys: Rate largest_with_sort largest_value largest_value_mem
largest_with_sort 111565/s -- -8% -13%
largest_value 121743/s 9% -- -5%
largest_value_mem 127783/s 15% 5% --
50 keys: Rate largest_with_sort largest_value largest_value_mem
largest_with_sort 24912/s -- -37% -40%
largest_value 39361/s 58% -- -6%
largest_value_mem 41810/s 68% 6% --
100 keys: Rate largest_with_sort largest_value largest_value_mem
largest_with_sort 9894/s -- -50% -56%
largest_value 19680/s 99% -- -12%
largest_value_mem 22371/s 126% 14% --
1,000 keys: Rate largest_with_sort largest_value largest_value_mem
largest_with_sort 668/s -- -69% -71%
largest_value 2183/s 227% -- -7%
largest_value_mem 2341/s 250% 7% --
10,000 keys: Rate largest_with_sort largest_value largest_value_mem
largest_with_sort 46.5/s -- -79% -81%
largest_value 216/s 365% -- -11%
largest_value_mem 242/s 421% 12% --
As you can see, if memory isn't much of an issue, the version with internal arrays is fastest, closely followed by the each
iterator, and in a distant third... sort
Not sure why everyone is doing this by hand...
use List::Util qw( reduce );
my $max_val_key = reduce { $hash{$a} > $hash{$b} ? $a : $b } keys %hash;
The following is more space-efficient and will run in O(n) instead of O(n log n) as compared to the other answers which sort the hash. It assumes values are integers greater than 0 and the hash is not empty, but should be easily extended for your case.
my $key_for_max_value;
my $max_value = -1;
while ((my $key, my $value) = each %hash) {
if ($value > $max_value) {
$max_value = $value;
$max_key = $key;
}
}
$key_for_max_value will now be the key corresponding to the highest value.
The keys sorted by value, from lowest to highest:
sort { $hash{$a} <=> $hash{$b} } keys %hash
The keys sorted by value, from highest to lowest:
reverse sort { $hash{$a} <=> $hash{$b} } keys %hash
And the first element
(reverse sort { $hash{$a} <=> $hash{$b} } keys %hash)[0]
Replace the spaceship with cmp
to taste.
my ($max_key, $max_val) = each %hash or die "hash is empty";
while (my ($key, $val) = each %hash) {
$max_key = $key, $max_val = $val if $val > $max_val;
}
my $highest_val = (sort { $hash{$a} <=> $hash{$b} } keys %hash)[0];
is likely to be what you want.
If you have a very large hash, you might want to use something like a Schwartzian transform:
my @array = map {[$hash{$_},$_]} keys %hash;
my $key_with_highest_value = (sort { $a->[0] <=> $b->[0] } @array)[0]->[1]
my $highest_val = (keys {$hash{$b} <=> $hash{$a}} keys %hash)[0];
If performance isn't an issue, I'd suggest a more literate programming solution.
use List::Util qw(max);
max keys %hash;
精彩评论