How can I correctly calculate the lengths of fields in a CSV document using Perl?

Developer https://www.devze.com 2022-12-18 04:56 Source: Internet
I have a dataset and would like to run a simple while loop over it with a Perl script. Here is a small extract from the dataset:

"number","code","country","gamma","X1","X2","X3","X4","X5","X6"
1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13
2,"AGO","Angola","0.00",6.79,"NULL",0.21,1,0,0.28
3,"BEN","Benin","-0.01",7.02,38.9,0.27,1,0,0.05
4,"BWA","Botswana","0.06",6.28,45.7,0.42,1,0,0.07
5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05
6,"BDI","Burundi","0.00",6.38,41.8,0.18,1,0,0

The script should compute the length of every comma-separated field and store the maximum length of each column in an array.

However, storing the maxima doesn't work properly. Here is part of the code:

@maxl = map length, @terms;

while(<INFILE>) {
$_ =~ s/[\"\n]//g ;
@terms = split/$sep/, $_;
@lengths = map length, @terms;
for($k = 0, $k <= $#terms, $k++) { 
    if($lengths[$k] > $maxl[$k]) {
    $maxl[$k] = $lenghts[$k];
    }
}
print "@lengths\n";
}

@maxl is initialized in an earlier part of the code, from the second line of the dataset. When I print @maxl just to see its values, I get:

1 3 7 4 4 4 4 1 1 5

Inside the while loop I added another print statement to see the lengths for each row, and I get:

1 3 6 4 4 4 4 1 1 4
1 3 5 5 4 4 4 1 1 4
1 3 8 4 4 4 4 1 1 4
1 3 12 4 4 4 4 1 1 4
1 3 7 4 4 4 4 1 1 1
1 3 8 4 4 4 4 1 1 4
1 3 10 4 4 4 4 1 1 4
1 3 16 5 4 4 4 1 1 4
2 3 4 5 3 4 4 1 1 4
2 3 7 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 8 4 4 4 4 1 1 4
2 3 5 4 4 4 1 1 1 4

The fourth column, for example, obviously has values greater than 3. The while loop was supposed to keep the greatest values and store them in @maxl.

What went wrong?


...in the for loop the commas are wrong; they should be semicolons:

for($k = 0, $k <= $#terms, $k++)
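To see why the commas matter: with commas, Perl treats the parenthesized part as a plain list and runs a foreach-style loop over its three elements, so the body runs three times no matter how many columns there are. The following is only a small sketch with made-up sample fields, comparing the two spellings by counting iterations:

```perl
use strict;
use warnings;

my @terms = qw(1 DZA Algeria 0.01 7.44);   # five sample fields (illustrative only)
my $k;

# Commas: Perl sees a three-element list and loops over it foreach-style.
my $comma_iterations = 0;
for ($k = 0, $k <= $#terms, $k++) {
    $comma_iterations++;
}

# Semicolons: the usual C-style "init; condition; step" loop.
my $semicolon_iterations = 0;
for ($k = 0; $k <= $#terms; $k++) {
    $semicolon_iterations++;
}

print "commas: $comma_iterations, semicolons: $semicolon_iterations\n";
```

With five fields, the comma version runs its body three times and the semicolon version five times, which is why most columns were never compared.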

However, after cleaning that up there still seems to be a problem...


For starters, there's a typo here: $maxl[$k] = $lenghts[$k]; ($lenghts instead of $lengths), which 'use strict' would have caught.
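As a rough sketch of the original approach with both fixes applied (semicolons in the for loop, $lengths spelled correctly) and 'use strict' enabled: the helper name max_field_lengths and the hard-coded sample rows are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the maximum field length per column.
sub max_field_lengths {
    my ($sep, @lines) = @_;
    my @maxl;
    for my $line (@lines) {
        (my $clean = $line) =~ s/["\n]//g;       # strip quotes and newline, as in the question
        my @terms   = split /\Q$sep\E/, $clean;
        my @lengths = map { length } @terms;
        for (my $k = 0; $k <= $#terms; $k++) {   # semicolons, not commas
            if (!defined $maxl[$k] || $lengths[$k] > $maxl[$k]) {
                $maxl[$k] = $lengths[$k];         # $lengths, spelled correctly
            }
        }
    }
    return @maxl;
}

# Three rows taken from the dataset extract in the question.
my @rows = (
    qq{1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13\n},
    qq{2,"AGO","Angola","0.00",6.79,"NULL",0.21,1,0,0.28\n},
    qq{5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05\n},
);
print join(' ', max_field_lengths(',', @rows)), "\n";   # prints: 1 3 12 4 4 4 4 1 1 4
```

Starting from an empty @maxl avoids having to seed it from a separately read line, so every row goes through the same quote and newline stripping before it is measured.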

Consider using Text::CSV for more reliable parsing of comma-separated data (it can also handle other separators):

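#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# The input file is taken from the command line, e.g.: perl script.pl data.csv
my $file = shift @ARGV or die "Usage: $0 FILE.csv\n";
open my $in, '<', $file or die "Cannot open '$file': $!";

my $csv = Text::CSV->new( { binary => 1 } );
my @max_lengths;

while ( my $line = <$in> ) {
    chomp $line;

    die "Unable to parse '$line'" unless $csv->parse($line);

    # Length of each field in this row (the parser has already removed the quotes)
    my @column_lengths = map { length } $csv->fields();

    # Keep the largest length seen so far for each column
    for my $i ( 0 .. $#column_lengths ) {
        if ( $column_lengths[$i] > ( $max_lengths[$i] || 0 ) ) {
            $max_lengths[$i] = $column_lengths[$i];
        }
    }
}
close $in;

print "MAX LENGTHS OF EACH FIELD: @max_lengths\n";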
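One way to see why a real parser can be more reliable than stripping quotes and splitting: a quoted field may itself contain the separator. The row below is made up for illustration (it is not from the dataset), and the Text::CSV part is skipped if the module is not installed.

```perl
use strict;
use warnings;

# Made-up row: the third field contains a comma inside the quotes.
my $line = q{7,"XXX","Example, Republic of","0.02"};

# Naive approach from the question: strip quotes, then split on commas.
(my $stripped = $line) =~ s/"//g;
my @naive = split /,/, $stripped;
print "split sees ", scalar(@naive), " fields\n";   # 5: the quoted comma splits the name

# Text::CSV keeps the quoted comma inside a single field.
if ( eval { require Text::CSV; 1 } ) {
    my $csv = Text::CSV->new( { binary => 1 } );
    if ( $csv->parse($line) ) {
        my @fields = $csv->fields();
        print "Text::CSV sees ", scalar(@fields), " fields\n";
    }
}
```

With the naive split the column positions shift by one after the country name, so the per-column maxima would be computed against the wrong columns for such a row.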