How to extract multiple patterns from a multi-line string

Developer https://www.devze.com 2023-04-11 01:10 Source: Internet
I have a string that looks like this. It's obviously a multi-line string and I would like to split it into one string per stanza.

{
   "timestamp":1317911700,
   "application":"system.dev",
   "metrics":{
      "qlen":0,
      "read.bytes":0,
      "write.bytes":185165.0123762,
      "busy":0.021423
   },
   "dimensions":{
      "device":"sda"
   }
}

{
   "timestamp":1317911700,
   "application":"system.fs",
   "metrics":{
      "inodes.used":246627,
      "inodes.free":28703901,
      "capacity.kb":227927024,
      "available.kb":209528472,
      "used.kb":6820512
   },
   "dimensions":{
      "filesystem":"/"
   }
}

{
   "status_code":0,
   "application":"system",
   "status_msg":"Data collected successfully"
}

My regex looks like this:

/^({\n[^}]+^})/m

But I am only capturing:

{
   "status_code":0,
   "application":"system",
   "status_msg":"Data collected successfully"
}

Which kinda makes sense since that's where the first curly brace is. What I am trying to do is capture from anywhere there is a /^{/ to anywhere there is a /^}/ as a single string. But I think the other curly braces in there are tr


I can think of a few approaches.

  • There is an example somewhere in perlre on how you can implement a recursive pattern. This is hard. You need to take curlies in strings into account.

  • Text::Balanced already provides means of matching balanced parens (including curlies). This might be easier, because I think it can take curlies in strings into account.

  • It looks like you can simply split on blank lines.

    @json_snippets = split /^$/m, $json_snippets;
    
  • But the most reliable solution is to use JSON::XS's "incremental parser". (Search for that in its documentation.)
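To make the incremental-parser suggestion concrete, here is a minimal sketch. It assumes JSON::PP, which ships with Perl and provides the same incr_parse method as JSON::XS, so swapping in JSON::XS should only require changing the module name. The sample input is a shortened, hypothetical version of the data in the question.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; JSON::XS offers the same incr_parse method

# Hypothetical sample input: two objects separated by a blank line.
my $str = <<'END';
{
   "application":"system.dev",
   "dimensions":{
      "device":"sda"
   }
}

{
   "status_code":0,
   "application":"system"
}
END

# In list context, incr_parse returns every complete JSON value it can
# extract from the text; only whitespace may separate the values.
my @records = JSON::PP->new->incr_parse($str);

print scalar(@records), " records\n";
print $_->{application}, "\n" for @records;
```

Note that this gives you decoded Perl data structures rather than one string per stanza. If you really need strings, you can re-encode each record, but usually the decoded hash is what you wanted in the first place.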


for my $stanza (split /^$/m, $str) {
  ...
}
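One thing to watch with the blank-line split: /^$/m matches the empty line with zero width, so the pieces keep the newlines around them. A small sketch of trimming them, using a made-up two-stanza string rather than the data from the question:

```perl
use strict;
use warnings;

# Hypothetical sample input: two stanzas separated by one blank line.
my $str = "{\n   \"a\":1\n}\n\n{\n   \"b\":2\n}\n";

# Each piece still carries the newline characters that surrounded the
# blank line. Trim them, then drop any piece that is empty afterwards
# (which happens when several blank lines appear in a row).
my @stanzas =
    grep { length }
    map  { s/\A\s+|\s+\z//gr }    # /r (Perl 5.14+) returns the trimmed copy
    split /^$/m, $str;

print "[$_]\n" for @stanzas;
```

This only works as long as no blank line appears inside a stanza.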


If you can't use a JSON parser to properly do it, I would just split at the end of a stanza.

my @stanzas = split /^}\K\n\n/m, $str;