At the beginning I simply used the following to count the length of each line:
while(<FH>){
chomp;
$length=length($_);
}
But when I compared my result with the one produced by the Linux command wc, I found a problem: every tab character in my file is counted as 1 character by length in Perl, whereas it is counted as 8 by wc. So I made the following modification:
while(<FH>){
chomp;
my $length=length($_);
my $tabCount= tr/\t/\t/;
my $lineLength=$length-$tabCount+($tabCount*8);
}
The above code now works for almost all cases, except for one: in wc not every tab is counted in full, but only the part that has not already been taken up by preceding characters. For example, if I type 1234 at the start of a line and then press tab, wc does not count it as a tab, but the above code does. Is there any way I could solve this issue? Thanks
Solved it using tab expansion. Here is the code:
1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
$length=length($string);
If anyone could explain it, that would be awesome. I tested it and it works, but I don't quite understand it. Anyway, thanks for all the help.
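For what it's worth, here is a minimal sketch of how that one-liner can be read. The expand_tabs helper, the $width parameter and the sample line are made up for illustration, and it assumes tab stops every 8 columns. Each pass replaces the first run of tabs: $& is the run itself and $` is everything before it, so length($`) % 8 is how far into the current 8-column cell the run starts. The `1 while` just repeats the substitution until no tabs are left.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the one-liner; $width is an assumed tab width.
sub expand_tabs {
    my ($string, $width) = @_;
    $width //= 8;
    # Replace the first run of tabs with enough spaces to reach the tab stop
    # that the run ends on. Earlier tabs are already expanded, so length($`)
    # is the real column where this run starts.
    1 while $string =~ s/\t+/' ' x (length($&) * $width - length($`) % $width)/e;
    return $string;
}

my $line     = "1234\tab\t\tz";      # made-up sample line
my $expanded = expand_tabs($line);
print length($expanded), "\n";       # prints 25
```

In the sample, the tab after 1234 only adds 4 spaces (up to column 8), and the two tabs after ab take the line from column 10 to column 24, which is why the expanded length is 25 rather than 4 + 8 + 2 + 16 + 1.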
I don't think tabs are your problem; wc doesn't count a tab as eight characters. I think your problem is that you're stripping EOLs but wc counts them. Also, you're not accumulating the lengths, you're just tracking the length of the last line. This:
while(<FH>){
chomp;
$length=length($_);
}
Should be more like this:
my $length = 0;
while(<FH>) {
$length += length($_);
}
# $length now has the total number of characters
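As a rough, self-contained sketch of the same idea, the loop can be tried on an in-memory filehandle. The sample data and the $fh and $total names are made up for illustration. Because the newlines are not chomped, the total should line up with a byte count such as wc -c, assuming single-byte characters.

```perl
use strict;
use warnings;

# Made-up sample data, read through an in-memory filehandle
my $data = "1234\tab\nxyz\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my $total = 0;
while (my $line = <$fh>) {
    # No chomp: the trailing newline is counted, and a tab counts as 1
    $total += length($line);
}
close($fh);

print "$total\n";    # prints 12
```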
How about just calling wc from within perl?
$result = `wc -l /path/to/file`;