I'm learning Perl and noticed a rather peculiar quirk -- attempting to match one of multiple regex conditions in a while loop results in that loop running forever:
#!/usr/bin/perl
my $hivar = "this or that";
while ($hivar =~ m/this/ig || $hivar =~ m/that/ig) {
print "$&\n";
}
The output of this program is:
this
that
that
that
that
[...]
I'm wondering why this is? Are there any workarounds that are less clumsy than this:
#!/usr/bin/perl
my $hivar = "this or that";
while ($hivar =~ m/this|that/ig) {
print "$&\n";
}
This is a simplification of a real-world problem I am encountering, and while I am interested in this from a practical standpoint, I would also like to know what is triggering this behavior behind the scenes. This is a question that doesn't seem to be very Google-compatible.
Thanks!
Tom
The thing is that there's a hidden value associated with each string, not with each match, that controls where a /g match will attempt to continue. It is accessible through pos($string). What happens is:
1. pos($hivar) is 0. /this/ matches at position 0 and sets pos($hivar) to 4. The second match isn't attempted because the or operator is already true. $& becomes "this" and gets printed.
2. pos($hivar) is 4. /this/ fails to match because there's no "this" at position 4 or beyond. The failing match resets pos($hivar) to 0.
3. /that/ matches at position 8 and sets pos($hivar) to 12. $& becomes "that" and gets printed.
4. pos($hivar) is 12. /this/ fails to match because there's no "this" at position 12 or beyond. The failing match resets pos($hivar) to 0.
5. /that/ matches at position 8 and sets pos($hivar) to 12. $& becomes "that" and gets printed.

Steps 4 and 5 then repeat indefinitely.
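The walk-through above can be checked by recording pos($hivar) after each successful match. This is only a sketch: the @trace array and the cap of four iterations are additions for illustration, since the original loop never terminates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $hivar = "this or that";
my @trace;           # one entry per loop iteration: matched text and pos()
my $iterations = 0;  # illustrative cap; not part of the original code

while ($hivar =~ m/this/ig || $hivar =~ m/that/ig) {
    # pos($hivar) is where the next /g match on $hivar would start
    push @trace, "$& pos=" . pos($hivar);
    last if ++$iterations >= 4;   # stop here; otherwise this loops forever
}

print "$_\n" for @trace;
# prints:
# this pos=4
# that pos=12
# that pos=12
# that pos=12
```

In this string "that" starts at offset 8 and ends at offset 12, so every iteration after the first finds the same "that" again: the failed /this/ match has already discarded the saved position before /that/ runs.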
Adding the c regex flag (which tells the engine not to reset pos on a failed match) solves the problem in the example code you provided, but it might or might not be the ideal solution to a more complex problem.
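As a minimal sketch of that suggestion, here is the loop from the question with /c added to both matches. The @found array is an addition for illustration, so the result can be inspected after the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $hivar = "this or that";
my @found;   # matches in the order the loop sees them (illustrative)

# /c keeps pos($hivar) where it is when a /g match fails,
# so the second pattern continues from the current position.
while ($hivar =~ m/this/igc || $hivar =~ m/that/igc) {
    push @found, $&;
}

print "$_\n" for @found;
# prints:
# this
# that
```

One reason this may not suit a more complex case: both patterns share a single position that only moves forward. With the input "that or this", /this/ matches first and moves the position past the end of "that", so "that" is never reported. The single alternation m/this|that/ig from the question does not have that limitation.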