I have SNP data and a gene list. For each SNP, I want to find the position in the gene list that contains it when I compare the two. For example:
The SNP data:
Pos_start  pos_end
14185      14185
....       .....
The gene list data:
5' side (pos_start)  3' side (pos_end)
1                    1527
1920                 1777
....                 .....
The result: SNP position 14185 is contained at position 16185 of the gene list.
Below is my code, but it has a problem sorting the numbers.
#!/usr/bin/perl -w
open(POS1,"<posi1.txt"); # I combined the two data sets and saved them as posi1.txt
@posi1=<POS1>;
open(list,">list.txt");
@list1=@posi1;
@list2= sort num_last (@list1);
$list2 = join( '', @list2);
print $list2;
print list $list2."\n\n";
close(list);
sub num_last {
my ($num_a, $num_b);
$num_a=$a=~ /^[0-9]/;
$num_b=$b=~ /^[0-9]/;
if ($num_a && $num_b){
return $a<=>$b;
} elsif ($num_a){
return 1;
} elsif ($num_b){
return -1;
} else {
return $a cmp $b;
}
}
I would appreciate it if you could give me some pointers.
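First of all, about how the values reach your sort sub: by default, sort hands the two elements to a named comparison sub in the package variables $a and $b, not in @_. If you prefer to receive them as arguments, declare the sub with a ($$) prototype, something like
sub num_last($$) {
my ($num_a, $num_b);
my ($a,$b) = @_;
....
}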
Then, in scalar context the match only tells you whether the string starts with a digit (true or false); it does not give you the number. Capture the whole leading number instead, and skip any leading whitespace, just in case.
($num_a) = $a =~ /^\s*(\d+)/;
($num_b) = $b =~ /^\s*(\d+)/;
\d+ is equivalent to [0-9]+ for ASCII input, but shorter :). The parentheses around $num_a and $num_b force list context, so each variable receives the content of the first matched group, (\d+).
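To make the context difference concrete, here is a minimal sketch. The sample line is made up for illustration; only the regex comes from the lines above.

```perl
use strict;
use warnings;

my $line = "  14185 14185\n";   # hypothetical sample line

# Scalar context: the match returns only true (1) or false.
my $matched = $line =~ /^\s*(\d+)/;

# List context (note the parentheses on the left): the match returns the captured groups.
my ($number) = $line =~ /^\s*(\d+)/;

print "scalar context: $matched\n";   # prints 1
print "list context:   $number\n";    # prints 14185
```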
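Then, keep the <=> operator for the numbers, but apply it to $num_a and $num_b rather than to the whole lines. cmp compares character by character, so "14185" would sort before "1527". Test the captures with defined, because a captured 0 is false. The condition becomes:
if (defined $num_a && defined $num_b) { return $num_a <=> $num_b; }
return 1 if defined $num_a;
return -1 if defined $num_b;
return $a cmp $b;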
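As a quick check of how the two operators order values like the ones in the question, a small sketch (the list simply reuses the numbers from the example data):

```perl
use strict;
use warnings;

my @positions = (14185, 1527, 1920, 1777, 1);   # numbers from the question's example

my @as_strings = sort { $a cmp $b } @positions;   # character-by-character order
my @as_numbers = sort { $a <=> $b } @positions;   # numeric order

print "cmp: @as_strings\n";   # cmp: 1 14185 1527 1777 1920
print "<=>: @as_numbers\n";   # <=>: 1 1527 1777 1920 14185
```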
It might look as if the whole thing could be as simple as return $a cmp $b, since an empty string does sort before any non-empty one, but cmp would still order the numbers as text, and I have no perl at my fingertips to test. So, the final num_last function:
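sub num_last($$) {
my ($num_a, $num_b);
my ($a,$b) = @_; # the ($$) prototype makes sort pass the elements in @_
($num_a) = $a =~ /^\s*(\d+)/; # leading number, or undef
($num_b) = $b =~ /^\s*(\d+)/;
if (defined $num_a && defined $num_b) { return $num_a <=> $num_b; }
return 1 if defined $num_a; # lines with a number go last
return -1 if defined $num_b;
return $a cmp $b; # neither line starts with a number
}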
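A minimal, self-contained sketch of such a comparator in use. It assumes the ($$) prototype form, so that sort passes the two elements in @_; the variable names and the input lines are made up for illustration.

```perl
use strict;
use warnings;

# Comparator with a ($$) prototype, so sort passes the two elements in @_.
sub num_last($$) {
    my ($left, $right) = @_;
    my ($num_l) = $left  =~ /^\s*(\d+)/;   # leading number, or undef
    my ($num_r) = $right =~ /^\s*(\d+)/;
    return $num_l <=> $num_r if defined $num_l && defined $num_r;
    return  1 if defined $num_l;           # numeric lines go after text lines
    return -1 if defined $num_r;
    return $left cmp $right;               # both non-numeric: plain string order
}

# Hypothetical input lines, modelled on the question's example data.
my @lines = ("1920 1777\n", "Pos_start pos_end\n", "14185 14185\n", "1 1527\n");

my @sorted = sort num_last @lines;
print @sorted;
```

The header line comes out first and the numeric lines follow in ascending order of their first number.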
If you need a reverse sort, just replace my ($a,$b) = @_; with my ($b,$a) = @_;
And, I've written it without any compiler help, so there might be some minor errors in it.
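Once the sort works, the lookup described in the question (which gene interval contains a given SNP position) could be sketched roughly as follows. The sub name genes_containing, the second SNP, and the third gene interval are hypothetical. Treating a row whose start is larger than its end as a reversed interval is also an assumption, not something the question states.

```perl
use strict;
use warnings;

# Hypothetical sample data: SNP positions and gene intervals as [start, end].
my @snps  = (14185, 1600);
my @genes = ([1, 1527], [1920, 1777], [14000, 16185]);

# Return every gene interval that contains the given position.
sub genes_containing {
    my ($pos, @intervals) = @_;
    my @hits;
    for my $gene (@intervals) {
        # Assumption: start may be larger than end, so normalise the bounds first.
        my ($lo, $hi) = sort { $a <=> $b } @$gene;
        push @hits, $gene if $pos >= $lo && $pos <= $hi;
    }
    return @hits;
}

for my $snp (@snps) {
    my @hits = genes_containing($snp, @genes);
    if (@hits) {
        print "SNP $snp lies in gene $_->[0]-$_->[1]\n" for @hits;
    } else {
        print "SNP $snp lies in no listed gene\n";
    }
}
```

This scans every interval for every SNP, which is fine for small files; for large gene lists, sorting the intervals and using a binary search would be the usual next step.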