How to match several regular expression patterns sequentially in Perl

开发者 https://www.devze.com 2023-02-17 18:32 Source: web
I want to do matching in the following way for a large multiline text:

I have a few matching patterns:

$text =~ m#finance(.*?)end#s;

$text =~ m#<class>(.*?)</class>#s;

$text =~ m#/data(.*?)<end>#s;

If any one of them matches, print the result (print $1) and then continue matching the rest of the text against all three patterns.

How can I get the printed results in the order they appear in the whole text?

Many thanks for your help!


while ($text =~ m#(?: finance (.*?) end
                  |   <class> (.*?) </class>
                  |   /data   (.*?) <end>
                  )
                 #sgx) {
  print $+;   # text captured by whichever alternative matched
}

ought to do it.

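$+ holds the text of the highest-numbered capturing group that took part in the last successful match. Since only one alternative can match at a time, that is the capture from whichever pattern matched.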
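To see the loop and $+ working together, here is a minimal self-contained sketch. The sample text is made up for illustration, and collecting the captures in @found is just one way to inspect them; the regex itself is the three patterns from the question.

```perl
use strict;
use warnings;

# Hypothetical sample input, invented for this example.
my $text = join "\n",
    'intro /data alpha <end>',
    'finance beta end',
    '<class> gamma </class>',
    'finance delta end';

my @found;
while ($text =~ m#(?: finance (.*?) end
                  |   <class> (.*?) </class>
                  |   /data   (.*?) <end>
                  )
                 #sgx) {
    push @found, $+;   # capture from whichever alternative matched
}

# Captures come out in the order they occur in $text,
# regardless of the order of the alternatives in the regex.
print "[$_]\n" for @found;
```

Note that the captures keep their surrounding whitespace, because /x only ignores whitespace in the pattern, not in the text being matched.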

The /g modifier is intended for exactly this kind of use: in scalar context it turns the match into an iterator, so each evaluation resumes where the previous match ended instead of restarting at the beginning of $text.
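The resume point can be observed with the built-in pos() function, which reports the offset where the next /g match on a string will start. A small sketch, using a made-up one-line input and only two of the patterns to keep it short:

```perl
use strict;
use warnings;

my $text = 'finance A end <class> B </class>';   # made-up sample input

my @positions;
while ($text =~ m#finance (.*?) end | <class> (.*?) </class>#sgx) {
    # pos($text) is the offset just past the match that succeeded
    push @positions, pos($text);
}

print "matches ended at offsets: @positions\n";
```

Once the match finally fails, the loop ends and pos($text) is reset, so a later /g match on the same string starts from the beginning again.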

(And /x lets you use arbitrary whitespace, meaning you can make your regexes readable. Or as readable as they get, at least.)

If you need to deal with multiple captures, it becomes a bit harder as you can't use $+. You can, however, test for capturing groups being defined:

while ($text =~ m#(?: a (.*?) b (.*?) c
                  |   d (.*?) e (.*?) f
                  |   /data     (.*?) <end>
                  )
                 #sgx) {
  if (defined $1) {
    # first set matched: use $1 and $2 (don't need to check $2)
  }
  elsif (defined $3) {
    # second set matched: use $3 and $4
  }
  else {
    # final one matched: use $5
  }
}
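A runnable sketch of the same idea, with a made-up input string and a hypothetical @events array that records which alternative matched along with its captures:

```perl
use strict;
use warnings;

my $text = 'a1b2c--d3e4f--/data5<end>';   # made-up sample input

my @events;
while ($text =~ m#(?: a (.*?) b (.*?) c
                  |   d (.*?) e (.*?) f
                  |   /data     (.*?) <end>
                  )
                 #sgx) {
    if (defined $1) {
        push @events, "first:$1,$2";    # groups 1 and 2 belong together
    }
    elsif (defined $3) {
        push @events, "second:$3,$4";   # groups 3 and 4 belong together
    }
    else {
        push @events, "third:$5";       # only group 5 is left
    }
}

print "$_\n" for @events;
```

Groups belonging to alternatives that did not match are undefined after each successful match, which is what makes the defined tests reliable.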