I want to do matching in the following way for a large multiline text:
I have a few matching patterns:
$text =~ m#finance(.*?)end#s;
$text =~ m#<class>(.*?)</class>#s;
$text =~ m#/data(.*?)<end>#s;
If any one of them matches, print the result (print $1), then continue with the rest of the text and try all three patterns again.
How can I get the results printed in the order they appear in the whole text?
Many thanks for your help!
while ($text =~ m#(?: finance (.*?) end
                    | <class> (.*?) </class>
                    | /data   (.*?) <end>
                  )
                 #sgx) {
    print "$+\n";
}
ought to do it. $+ holds the text of the highest-numbered capturing group that actually took part in the last successful match; since only one alternative can match at a time, that is the capture you want.
The /g modifier is intended specifically for this kind of usage; in scalar context it turns the regex into an iterator that, when resumed, continues the match where it left off instead of restarting at the beginning of $text.
(And /x lets you use arbitrary whitespace, meaning you can make your regexes readable. Or as readable as they get, at least.)
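To see the ordering for yourself, here is a small self-contained sketch. The sample text is made up for illustration, and I collect the captures in an array (@results, my own name) before printing so they are easy to inspect; the patterns are the three from the question.

```perl
use strict;
use warnings;

# Hypothetical sample text, invented only to exercise all three patterns.
my $text = <<'END_TEXT';
intro <class>alpha</class> noise
finance beta
spans lines end more noise
/data gamma <end> trailer
END_TEXT

my @results;

# /g in scalar context resumes after the previous match,
# /s lets . cross newlines, /x allows the layout whitespace.
while ($text =~ m#(?: finance (.*?) end
                    | <class> (.*?) </class>
                    | /data   (.*?) <end>
                  )
                 #sgx) {
    # Only one alternative matched, so $+ is its capture.
    push @results, $+;
}

print "[$_]\n" for @results;
```

The captures come out in the order the matches occur in the text, not in the order the alternatives are written. Note that the whitespace around each capture is kept, because /x only ignores whitespace in the pattern, not in the text.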
If you need to deal with multiple captures, it becomes a bit harder as you can't use $+. You can, however, test for capturing groups being defined:
while ($text =~ m#(?: a (.*?) b (.*?) c
                    | d (.*?) e (.*?) f
                    | data (.*?) </end>
                  )
                 #sgx) {
    if (defined $1) {
        # first set matched (don't need to check $2)
    }
    elsif (defined $3) {
        # second set matched
    }
    else {
        # final one matched
    }
}
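A runnable version of the same idea, as a rough sketch: the input string and the "first:"/"second:"/"third:" labels are placeholders of mine, picked only so that each branch fires once.

```perl
use strict;
use warnings;

# Hypothetical input; the single-letter delimiters mirror the placeholders above.
my $text = "d1e2f a3b4c data5</end>";

my @results;

while ($text =~ m#(?: a (.*?) b (.*?) c
                    | d (.*?) e (.*?) f
                    | data (.*?) </end>
                  )
                 #sgx) {
    if (defined $1) {
        # First alternative: its captures are $1 and $2.
        push @results, "first:$1|$2";
    }
    elsif (defined $3) {
        # Second alternative: its captures are $3 and $4.
        push @results, "second:$3|$4";
    }
    else {
        # Third alternative: its only capture is $5.
        push @results, "third:$5";
    }
}

print "$_\n" for @results;
```

Groups belonging to alternatives that did not take part in the current match are undefined, which is what makes the defined tests reliable. The capture numbers keep counting across alternatives, so the third branch reads $5 rather than $1.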