Perl: Reading sorted array elements into a hash with sorted keys

Developer https://www.devze.com 2023-04-04 14:21 Source: Internet
I have an array (say @array) of sorted values between 0 and 1, and a hash (say %hash) whose keys are sorted numbers between 0 and 1. The value for each key in the hash is 0. Now I need to look at each element of @array, find the key in %hash that is immediately smaller than it, and increment the corresponding value by 1. That is, the keys serve as lower bounds for intervals.

For example, say

@array = (0.15, 0.33, 0.67, 0.87)
and %hash = ("0.25", 0, "0.50", 0, "0.75", 0)

and I take $array[1] = 0.33.

Then I need to be able to determine that $array[1] is greater than 0.25 but less than 0.50 and, thus, increment the value for "0.25" by 1, giving me an updated hash %hash = ("0.25", 1, "0.50", 0, "0.75", 0).

I hope this makes sense. Thanks in advance!


A hash does not store its keys in sorted order. You must rethink your approach to the problem.
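One common way around this is to leave the hash unordered and build a sorted list of its keys whenever an ordered view is needed. The following is a minimal sketch using the hash from the question; the name @lower_bounds is only illustrative.

```perl
use strict;
use warnings;

my %hash = ('0.25' => 0, '0.50' => 0, '0.75' => 0);

# keys() returns the keys in no particular order, so sort them
# numerically each time an ordered view is needed.
my @lower_bounds = sort { $a <=> $b } keys %hash;

print join(', ', @lower_bounds), "\n";    # prints: 0.25, 0.50, 0.75
```

The numeric comparison ({ $a <=> $b }) matters: the default sort compares strings, which only happens to give the right order when every key is written with the same number of digits.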


You're building a frequency distribution for intervals or ranges. CPAN has modules that will do that. If you can reformulate your problem to agree with how those modules understand frequency distributions, you'll be able to save yourself a little bit of trouble and gain access to other statistical tools that might be useful for your project. An example:

use Statistics::Descriptive;
my @data = (0.15, 0.33, 0.67, 0.87);
my @bins = (0.25, 0.50, 0.75, 1.00);
my $stat = Statistics::Descriptive::Full->new();
$stat->add_data(@data);
my $freq = $stat->frequency_distribution_ref(\@bins);

The distribution in $freq will be a hash reference like this:

$freq = {
  '0.25' => 1,
  '0.5'  => 1,  # N of items x, such that PREVIOUS_BIN_VAL < x <= .50
  '0.75' => 1,
  '1'    => 1,
};

If you can't modify your problem, then you'll need to compute the distribution yourself, but you can take an important cue from Statistics::Descriptive. In particular, it will be helpful for you to have an ordered list of bin values. Here's an illustration:

my @data = (0.15, 0.33, 0.67, 0.87);
my @bins = (0.25, 0.50, 0.75);    # Include 0.0 if you want 0.15 to be tallied.
my %freq = map {$_ => 0} @bins;

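for my $d (@data) {
    # Walk the bins from largest to smallest; the first bin that is <= $d
    # is the lower bound of the interval containing $d.
    for my $bin (reverse @bins) {    # $bin rather than $b, which sort() uses
        if ($d >= $bin) {
            $freq{$bin}++;
            last;
        }
    }
}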


As far as I understand, you want to keep track of how many items in $array are less than each key in %hash.

So for each key in the hash, you can retrieve all items from the array that are less than the key and count them. You can use grep for this:

use strict;
use warnings;
use Data::Dumper;    

my $array = [qw(0.15 0.33 0.67 0.87 1.5)];
my %hash  = (0.25 => 0, 0.50 => 0, 0.75 => 0, 0.05 => 0);

for my $k (keys %hash) {
    # grep in scalar context returns the number of matching elements
    my $count = grep { $_ < $k } @$array;
    $hash{$k} = $count;
    # $hash{$k} = $count ? 1 : 0;    # if you just want a flag
}

print Dumper(\%hash);


If your hash keys are evenly spaced, as in your example, the key for each element $x can be calculated with a simple formula such as $biggestSmaller = int($x * 4) / 4 (format the result to match your key strings, since 0.5 and "0.50" are different hash keys). If not, you need an auxiliary index such as @keys = sort { $a <=> $b } keys %hash. It could also be a binary tree, but this is trivial enough that a simple list ought to do. If speed is not important, you could even be lazy and search the list from the bottom up instead of implementing a binary search.
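For unevenly spaced keys, the sorted-index idea with a binary search could look roughly like the sketch below. The helper name floor_key is hypothetical, and treating each key as an inclusive lower bound (<=) is an assumption; use < in the comparison if an element equal to a key should fall into the interval below it.

```perl
use strict;
use warnings;

# Return the largest key that is <= $x, or undef if $x is below every key.
# $keys must be an array reference sorted in ascending numeric order.
sub floor_key {
    my ($keys, $x) = @_;
    my ($lo, $hi) = (0, $#$keys);
    my $found;
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($keys->[$mid] <= $x) {
            $found = $keys->[$mid];    # candidate; look for a larger one
            $lo = $mid + 1;
        }
        else {
            $hi = $mid - 1;
        }
    }
    return $found;
}

my @array = (0.15, 0.33, 0.67, 0.87);
my %hash  = ('0.25' => 0, '0.50' => 0, '0.75' => 0);
my @keys  = sort { $a <=> $b } keys %hash;

for my $x (@array) {
    my $key = floor_key(\@keys, $x);
    $hash{$key}++ if defined $key;    # 0.15 is below every key and is skipped
}

print "$_ => $hash{$_}\n" for @keys;
```

Because floor_key returns the key taken from @keys rather than a recomputed number, the lookup into %hash uses exactly the original key string.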
