Finding boundaries of substrings

Developer https://www.devze.com 2023-01-18 06:56 Source: Internet
I've got a string that contains multiple substrings, each of which contains one or more 'E' characters. I am trying to get the coordinates of each of these substrings using Perl and regex. Here is what I tried at first.

#!/usr/bin/perl
use strict;

my $str = "GGGFFEEIIEIIIIEEEIIIETTGGG";
foreach my $match($str =~ m/(E+)/)
{
  print "match: $match, coords: (". $-[0] .", ". $+[0] .")\n";
}

The terminal output looks like this...

> ./test
match: EE, coords: (5, 7)

So it successfully finds the first substring. But I would like to identify every substring, so I added the 'g' modifier to the regex like so...

#!/usr/bin/perl
use strict;

my $str = "GGGFFEEIIEIIIIEEEIIIETTGGG";
foreach my $match($str =~ m/(E+)/g)
{
  print "match: $match, coords: (". $-[0] .", ". $+[0] .")\n";
}

which gives the following terminal output.

> ./test
match: EE, coords: (20, 21)
match: E, coords: (20, 21)
match: EEE, coords: (20, 21)
match: E, coords: (20, 21)

As you can see, it finds each substring correctly, but I am only pulling out the coordinates of the last match. Maybe I'm using $- and $+ incorrectly? Any ideas how I can grab these coordinates correctly? Thanks.


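foreach evaluates the match in list context, so m//g finds all of the matches and builds the complete list before the loop body runs. By the time the body executes, @- and @+ contain only the data from the last match. A while loop evaluates the match in scalar context instead, advancing one match per iteration. Try: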

#!/usr/bin/perl
use strict;

my $str = "GGGFFEEIIEIIIIEEEIIIETTGGG";
while ($str =~ m/(E+)/g)
{
  printf "match: %s, coords: (%d, %d)\n", $1, $-[0], $+[0];
}
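
If the coordinates are needed after the loop rather than just printed, the same while loop can collect them as it goes. The following is only a sketch: find_runs is a hypothetical helper name, not something from the original post. It also uses substr to show that $+[0] is the offset just past the end of the match, so the match length is $+[0] - $-[0].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one
# [match, start, end] triple per run of 'E' characters.
sub find_runs {
    my ($str) = @_;
    my @runs;
    # In scalar context, m//g advances one match per iteration,
    # so @- and @+ describe the current match inside the loop body.
    while ($str =~ m/(E+)/g) {
        push @runs, [ $1, $-[0], $+[0] ];
    }
    return @runs;
}

my $str = "GGGFFEEIIEIIIIEEEIIIETTGGG";
for my $run (find_runs($str)) {
    my ($match, $start, $end) = @$run;
    # $end is the offset just past the match, so the length is $end - $start.
    my $check = substr($str, $start, $end - $start);
    print "match: $match, coords: ($start, $end), substr: $check\n";
}
```

For the string above this should report the runs at (5, 7), (9, 10), (14, 17) and (20, 21), with the substr column repeating each match.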