My test file has "n" lines, and between each line there is a ^M, which in turn makes it one big string. The code I am working with opens the file and should parse out a header and then the subsequent rows, then search for the Directory Path and Filename columns. But because the file ends up as one big string, it doesn't work correctly.
#!/usr/bin/perl
#use strict;
#use warnings;
open (DATA, "<file.txt") or die ("Unable to open file");
my $search_string = "Directory Path";
my $column_search = "Filename";
my $header = <DATA>;
my @header_titles = split /\t/, $header;
my $extract_col = 0;
my $col_search = 0;
for my $header_line (@header_titles) {
    last if $header_line =~ m/$search_string/;
    $extract_col++;
}
for my $header_line (@header_titles) {
    last if $header_line =~ m/$column_search/;
    $col_search++;
}
print "Extracting column $extract_col $search_string\n";
while ( my $row = <DATA> ) {
    last unless $row =~ /\S/;
    chomp $row;
    my @cells = split /\t/, $row;
    $cells[74] =~ s/:/\//g;
    $cells[$extract_col] = $cells[74] . $cells[$col_search];
    print "$cells[$extract_col] \n";
}
When I open the test file in vi, I use
:%s/^M/\r/g
and that removes the ^M's, but how do I do it inside this Perl program? When I tried a test program that applied s\^M/\r/g and wrote the result to a different file, the output came up as a lot of Chinese characters.
If mac2unix isn't working for you, you can write your own mac2unix as a Perl one-liner:
perl -pi -e 'tr/\r/\n/' file.txt
That will likely fail if the file is larger than virtual memory, though: a CR-only file contains no "\n", so perl treats the whole file as a single line and reads it into memory at once.
For completeness, let's also have a dos2unix:
perl -pi -e 'tr/\r//d' file.txt
and a unix2dos:
perl -pi -e 's/\n/\r\n/g' file.txt
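If you would rather do the conversion inside the script than with a separate one-liner, the three cases can be folded into a single substitution. This is only a sketch, assuming the text fits in memory; the name normalize_newlines is hypothetical and not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: convert DOS ("\r\n") and old Mac ("\r") line
# endings to Unix ("\n"). Existing "\n" endings are left untouched.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # a CR, optionally followed by an LF, becomes one LF
    return $text;
}

# Made-up sample strings, one per line-ending style.
my %samples = (
    mac  => "a\rb\r",
    dos  => "a\r\nb\r\n",
    unix => "a\nb\n",
);

for my $style ( sort keys %samples ) {
    my $fixed = normalize_newlines( $samples{$style} );
    print "$style: ", ( $fixed eq "a\nb\n" ? "normalized" : "unexpected" ), "\n";
}
```

Matching "\r\n?" rather than running tr/\r/\n/ avoids turning each DOS line ending into two newlines.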
Before you start reading the file, set $/ to "\r". This is set to the linefeed character by default, which is fine for UNIX-style line endings, and almost OK for DOS-style line endings, but useless for the old Mac-style line endings you are seeing. You can also try mac2unix on your input file if you have it installed.
For more, look for "INPUT_RECORD_SEPARATOR" in the perlvar manpage.
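A minimal sketch of this approach, assuming tab-separated input with CR-only line endings. The helper name read_cr_records and the in-memory sample data are hypothetical and only stand in for file.txt; in the real script you would open the file itself.

```perl
use strict;
use warnings;

# Hypothetical helper: read every record from a filehandle whose
# lines end in a bare carriage return ("\r").
sub read_cr_records {
    my ($fh) = @_;
    local $/ = "\r";    # input record separator; restored when the sub returns
    my @records;
    while ( my $row = <$fh> ) {
        chomp $row;     # chomp strips the current value of $/, here "\r"
        push @records, $row;
    }
    return @records;
}

# Sample data standing in for file.txt (an assumption for this sketch).
my $sample = "Directory Path\tFilename\rvol:dir\tfoo.txt\r";
open my $fh, '<', \$sample or die "Unable to open in-memory file: $!";
my @rows = read_cr_records($fh);
close $fh;

# The first record is the header; split it on tabs as the original script does.
my @header_titles = split /\t/, $rows[0];
print "Header columns: @header_titles\n";
print "Data rows: ", scalar(@rows) - 1, "\n";
```

Using local keeps the change to $/ confined to the helper, so other reads in the program still see the default separator.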
Did this file originate on a Windows system? If so, try running the dos2unix command on the file before reading it. You can do this before invoking the Perl script or inside the script before you read it.
You might want to set $/ (the input record separator) to ^M, the carriage return, at the beginning of your script, such as:
$/ = "\r";
perl -MExtUtils::Command -e dos2unix file