Is this the fastest (execution time) way to find the longest element in a list?
#!/usr/bin/env perl
use warnings;
use 5.012;
use List::Util qw(reduce);
use List::Util::XS;
my @array = qw( one two three four five six seven eight nine ten eleven 开发者_如何学Go);
my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
say $l;
When only trying to find one element of a list, there is no need to construct an N sized data structure as many answers here have done. The fastest O(N)
way to do this is to walk the array, keeping track of the largest element. That way you have O(N)
accesses of the list, and O(1)
memory usage.
sub longest {
my $max = -1;
my $max_i = 0;
for (0 .. $#_) { # for each index
my $len = length $_[$_]; # only get length once per item
if ($len > $max) { # save index and update max if larger
$max = $len;
$max_i = $_;
}
}
$_[$max_i] # return the largest item
}
If you are going to be running the above code many times, I would suggest inlining the body of the subroutine.
EDIT:
drewk's benchmark revealed that the array index in the above code is a bit of a bottleneck. Experimenting a little more, I have finally found a method that is faster than the reduce
solution:
sub fastest {
my $max = -1;
my $max_ref;
for (@_) {
if (length > $max) { # no temp variable, length() twice is faster
$max = length;
$max_ref = \$_; # avoid any copying
}
}
$$max_ref
}
which results in the following benchmark:
Rate longest drewk reduce fastest
longest 44245/s -- -21% -30% -47%
drewk 55854/s 26% -- -11% -33%
reduce 63014/s 42% 13% -- -25%
fastest 83638/s 89% 50% 33% --
Here is slightly modified version of OMG_peanuts with for and less variables:
my $len = length $array[0];
my $longest = 0;
for my $i (1 .. $#array) {
my $i_len = length $array[$i];
if($i_len > $len) {
$longest = $i;
$len = $i_len;
}
}
my $l = $array[$longest];
I was playing a bit with benchmarks, getting this for small numbers (original array)
Rate REDUCE TMPVAR TMPFOR
REDUCE 234862/s -- -0% -7%
TMPVAR 235643/s 0% -- -6%
TMPFOR 251326/s 7% 7% --
and this for larger number or items (original array x 100
)
Rate TMPVAR TMPFOR REDUCE
TMPVAR 3242/s -- -28% -32%
TMPFOR 4503/s 39% -- -5%
REDUCE 4750/s 47% 5% --
Note that suitability of algorithm heavily varies due to data specifics (I would guess longer strings may increase weight of length
function in algorithm).
EDIT: Here is full code for the benchmark (long array version, short is missing x 100
in array definition)
use Benchmark qw(:all);
use List::Util qw(reduce);
my @array = qw( one two three four five six seven eight nine ten eleven ) x 100;
cmpthese(-2, {
REDUCE => sub {
my $l = reduce{ length($a) gt length($b) ? $a : $b } @array;
},
TMPVAR => sub {
my $idx = 1;
my $lastLength = length $array[0];
my $lastElt = $array[0];
my $listLength = scalar @array;
while ($idx < $listLength) {
my $tmpLength = length $array[$idx];
if ($tmpLength > $lastLength) {
$lastElt = $array[$idx];
$lastLength = $tmpLength
}
$idx++
}
my $l = $lastElt;
},
TMPFOR => sub {
my $len = length $array[0];
my $longest = 0;
for my $i (1 .. $#array) {
my $i_len = length $array[$i];
if($i_len > $len) {
$longest = $i;
$len = $i_len;
}
}
my $l = $array[$longest];
},
});
My fastest is:
sub drewk {
my $len = -1;
for (@_) {
my $tmp=length($_);
if ( $tmp > $len ) {
$longest = $_;
$len = $tmp;
}
}
return $longest;
}
But benchmarking against:
sub strom {
my $max = -1;
my $max_i = 0;
for (0 .. $#_) { # for each index
my $len = length $_[$_]; # only get length once per item
if ($len > $max) { # save index and update max if larger
$max = $len;
$max_i = $_;
}
}
$_[$max_i] # return the largest item
}
sub red {
return reduce{ length($a) > length($b) ? $a : $b } @_;
}
Shows that reduce
is fastest:
Rate strom drewk reduce
strom 1323455/s -- -38% -45%
drewk 2144549/s 62% -- -10%
reduce 2390707/s 81% 11% --
The other benchmark is Eric Strom's sub
A little golfish:
my @unsorted = qw( one two three four five six seven eight nine ten eleven );
my $longest = (
map { $_->[0] }
sort { $b->[1] <=> $a->[1] }
map { [ $_, length $_ ] } @unsorted
)[0];
say $longest;
EDIT: the map/sort/map is a Schwartzian transform for anyone unfamiliar with that technique and wondering.
Assuming the goal is to just find the longest string, not its index:
my $longest = $array[0];
my $len = length $longest;
for my $str (@array) {
if ( length($str) > $len ) {
$longest = $str;
$len = length($str);
}
}
If you really want to cut-off number of computed length
's, then take a look at Schwartian transform and adopt it to your problem.
EDIT:
I see that no one posted complete example I meant, so here it is (I haven't benchmarked it yet as I'm not in from of my personal computer):
my @array = qw( one two three four five six seven eight nine ten eleven );
my $longest = (
reduce { $a->[1] > $b->[1] ? $a : $b }
map { [ $_, length $_ ] }
@array
)[0];
say $longest;
This appears to be significantly faster than other solutions (based on fastest_Eric_Storm),
use warnings;
use 5.012;
use Benchmark qw(:all) ;
use List::Util qw(reduce);
my @array = map { ($_) x 50 } qw( one two three four five six seven eight nine ten eleven );
sub list_util_xs {
my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
return $l;
}
sub fastest_Eric_Strom {
my $max = -1; my $max_ref;
for (@array) {
if (length > $max) {
$max = length;
$max_ref = \$_;
}
}
return $$max_ref;
}
sub ysth {
my $longest = $array[0];
my $len = length $longest;
for my $str (@array) {
if ( length($str) > $len ) {
$longest = $str;
$len = length($str);
}
}
return $longest;
}
sub mpapec {
my $max = -1;
my $max_ref;
length > $max and ($max, $max_ref) = (length, \$_) for @array;
return $$max_ref;
}
cmpthese( -10, {
'list_util_xs' => sub{ list_util_xs() },
'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
'ysth' => sub{ ysth() },
'mpapec' => sub{ mpapec() },
});
output
Rate list_util_xs fastest_Eric_Storm ysth mpapec
list_util_xs 13479/s -- -24% -24% -29%
fastest_Eric_Storm 17662/s 31% -- -0% -6%
ysth 17680/s 31% 0% -- -6%
mpapec 18885/s 40% 7% 7% --
You could use some temp var to avoid calculing length again and again:
my @unsorted = qw( one two three four five six seven eight nine ten eleven );
my $idx = 1;
my $lastLength = length $unsorted[0];
my $lastElt = $unsorted[0];
my $listLength = scalar @unsorted;
while ($idx < $listLength) {
my $tmpLength = length $unsorted[$idx];
if ($tmpLength > $lastLength) {
$lastElt = $unsorted[$idx];
$lastLength = $tmpLength
}
$idx++
}
print "Longest element:$lastElt";
Benchmark results :
Rate REDUCE TMPVAR
REDUCE 169297/s -- -29%
TMPVAR 237926/s 41% --
First I tested if all routines give me the right result. MBO's routine didn't pass the first test (it returns an array-reference); to give him a second change I modified the routine to get the right result.
I run the benchmark some times and I didn't always get the same order. So I would say ( as already sed here ) ysth's and fasest_Eric_Strom are the fastest but list_utils reduce is almost as fast as they are; what's easy to read from the results is that David Precious sort-version is the slowest and MBO's modified reduce-version is second slowest.
My conclusion: list_utils reduce is the winner of the best price-performance ratio.
edit:
I was too fast with the prize giving ceremony:
List::Util - reduce - length - encoding - question
David_Precious 64147/s -- -36% -73% -79% -80% -81% -85% -86% -87%
MBO 100195/s 56% -- -58% -67% -69% -70% -77% -79% -80%
OMG_peanuts 237772/s 271% 137% -- -21% -27% -30% -45% -50% -52%
longest_Eric_Strom 300466/s 368% 200% 26% -- -8% -11% -31% -36% -40%
drewk 325883/s 408% 225% 37% 8% -- -4% -25% -31% -34%
bvr 338156/s 427% 237% 42% 13% 4% -- -22% -28% -32%
list_util_xs 434114/s 577% 333% 83% 44% 33% 28% -- -8% -13%
fastest_Eric_Strom 471812/s 636% 371% 98% 57% 45% 40% 9% -- -5%
ysth 497198/s 675% 396% 109% 65% 53% 47% 15% 5% --
.
#!/usr/bin/env perl
use warnings;
use 5.012;
use Benchmark qw(:all) ;
use List::Util qw(reduce);
my @array = qw( one two three four five six seven eight nine very_long_long ten eleven );
sub list_util_xs {
my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
return $l;
}
sub longest_Eric_Strom {
my $max = -1; my $max_i = 0;
for (0 .. $#array) {
my $len = length $array[$_];
if ($len > $max) {
$max = $len;
$max_i = $_;
}
}
return $array[$max_i];
}
sub fastest_Eric_Strom {
my $max = -1; my $max_ref;
for (@array) {
if (length > $max) {
$max = length;
$max_ref = \$_;
}
}
return $$max_ref;
}
sub David_Precious {
my $longest = ( map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [ $_, length $_ ] } @array )[0];
return $longest;
}
sub MBO {
my $longest = ( reduce { $a->[1] > $b->[1] ? $a : $b } map { [ $_, length $_ ] } @array )[0];
return $longest->[0];
}
sub drewk {
my $len = -1; my $longest;
for (@array) {
my $tmp=length($_);
if ( $tmp > $len ) {
$longest = $_;
$len = $tmp;
}
}
return $longest;
}
sub ysth {
my $longest = $array[0];
my $len = length $longest;
for my $str (@array) {
if ( length($str) > $len ) {
$longest = $str;
$len = length($str);
}
}
return $longest;
}
sub bvr {
my $len = length $array[0];
my $longest = 0;
for my $i (1 .. $#array) {
my $i_len = length $array[$i];
if($i_len > $len) {
$longest = $i;
$len = $i_len;
}
}
return $array[$longest];
}
sub OMG_peanuts {
my $idx = 1;
my $lastLength = length $array[0];
my $lastElt = $array[0];
my $listLength = scalar @array;
while ($idx < $listLength) {
my $tmpLength = length $array[$idx];
if ($tmpLength > $lastLength) {
$lastElt = $array[$idx];
$lastLength = $tmpLength
}
$idx++
}
return $lastElt;
}
cmpthese( -10, {
'list_util_xs' => sub{ list_util_xs() },
'longest_Eric_Storm' => sub{ longest_Eric_Strom() },
'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
'David_Precious' => sub{ David_Precious() },
'MBO' => sub{ MBO() },
'drewk' => sub{ drewk() },
'ysth' => sub{ ysth() },
'OMG_peanuts' => sub{ OMG_peanuts() },
'bvr' => sub{ bvr() },
});
You could reduce the number of times you have to calculate the string's length by reducing to a struct or array, containing the length next to the string itself.
Further, the iteration is (to be) optimized by the reduce
algorithm, the length
call is hardly optimizeable.
精彩评论