I want to match strings like those below.
abc|q:1,f:2
cba|q:1,f:awd2,t:3awd,h:gr
I am using PHP and have tried both preg_match and preg_match_all with this expression:
/^([a-z]+)\|([a-z]+:[a-z0-9]+,?)+$/iU
This only returns the part before the pipe and a single key:value pair such as a:1. What am I doing wrong, why does it behave this way, and how can I make it work?
/^([a-z]+)\|((?:[a-z]+:[a-z0-9]+,?)+)$/iU
would capture:
- the part before the pipe
- the part after the pipe
Because the '+' quantifier repeats your capturing group ([a-z]+:[a-z0-9]+,?), the group is overwritten on every repetition and only keeps the last set of characters matching this regex.
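To make the difference visible, here is a minimal sketch that runs the pattern from the question and the corrected one against the same sample string. The variable names $repeated and $wrapped are only illustrative.

```php
<?php
$string = 'cba|q:1,f:awd2,t:3awd,h:gr';

// Pattern from the question: the capturing group itself is repeated,
// so group 2 is overwritten on each repetition and keeps the last pair.
preg_match('/^([a-z]+)\|([a-z]+:[a-z0-9]+,?)+$/iU', $string, $repeated);

// Corrected pattern: the repetition sits inside a non-capturing group,
// and one capturing group wraps the whole repeated part.
preg_match('/^([a-z]+)\|((?:[a-z]+:[a-z0-9]+,?)+)$/iU', $string, $wrapped);

print_r($repeated); // group 2 holds only the last pair
print_r($wrapped);  // group 2 holds everything after the pipe
```

With this input, $repeated[2] should be just h:gr, while $wrapped[2] should be the full list q:1,f:awd2,t:3awd,h:gr.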
/(?ms)^((?:[a-z]+)\|(?:[a-z]+:[a-z0-9]+,?)+)$/iU
would capture the whole line.
Note the '?:' to avoid creating any capturing group.
I just tried:
<?php
$string = 'cba|q:1,f:awd2,t:3awd,h:gr';
$subpat = '[a-z]+:[a-z0-9]+';
$pat = "/^([a-z]+)\|($subpat(?:,$subpat)+)$/i";
preg_match( $pat, $string, $matches );
print_r( $matches );
?>
which yields
Array
(
[0] => cba|q:1,f:awd2,t:3awd,h:gr
[1] => cba
[2] => q:1,f:awd2,t:3awd,h:gr
)
At this point you have the part before the vertical bar in $matches[1] and the rest in $matches[2]. The repetition of $subpat is there to ensure that the pairs are properly separated by commas. After that, apply explode on $matches[2].
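The explode step could look roughly like the sketch below. It starts from the value of $matches[2] shown in the output above; the names $pairs, $data, $pair, $key and $value are assumptions made for the example, not part of the original answer.

```php
<?php
// The value of $matches[2] from the preg_match call above.
$pairs = 'q:1,f:awd2,t:3awd,h:gr';

$data = [];
foreach (explode(',', $pairs) as $pair) {
    // Each pair has the form "key:value"; the limit of 2 splits
    // on the first ':' only.
    list($key, $value) = explode(':', $pair, 2);
    $data[$key] = $value;
}

print_r($data);
```

This gives an associative array keyed by the part before each colon, so $data['f'] would be awd2.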
$string = 'cba|q:1,f:awd2,t:3awd,h:gr';
$re = '~(?: ^(\w+)\| ) | (?: (\w+) : (\w+) (?:,|$) )~x';
preg_match_all($re, $string, $m, PREG_SET_ORDER);
var_dump($m);
This will match the part before the pipe ("lead") and all key-value pairs at once. "lead" will be in $m[0][1] and the key-value pairs will be in $m[1..x][2] and $m[1..x][3]. Add some simple post-processing to convert this to a usable form, for example:
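$lead = $m[0][1];
$data = array(); // initialise before filling it in the loop
foreach (array_slice($m, 1) as $p) {
    $data[$p[2]] = $p[3]; // key is group 2, value is group 3
}
var_dump($lead, $data);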