I'm looking for a function, class, or collection of functions that will help with pattern matching on strings. I have a project that requires a fair amount of pattern matching, and I'd like something easier to read and maintain than raw preg_replace (or regex, period).
I've provided a pseudo example in hopes that it will help you understand what I'm asking.
$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';
pattern_match($subject, $pattern, 0);
would return "$2,500".
$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';
pattern_match($subject, $pattern, 1);
would return an array with the values: [$2,500], [$1,250], [$1,250]
The function, as I'm trying to write it, uses 'n' for digits, 'c' for lower-case letters, and 'C' for upper-case letters; any non-alphanumeric character represents itself.
Any help would be appreciated.
<?php
// $match_all = false: returns a string with the first match, or false if none
// $match_all = true: returns an array of strings with all matches
function pattern_match($subject, $pattern, $match_all = false)
{
    // Escape regex metacharacters (and the '|' delimiter) before substituting tokens
    $pattern = preg_quote($pattern, '|');
    $ar_pattern_replaces = array(
        'n' => '[0-9]',
        'c' => '[a-z]',
        'C' => '[A-Z]',
    );
    $pattern = strtr($pattern, $ar_pattern_replaces);
    $pattern = "|".$pattern."|";
    if ($match_all)
    {
        preg_match_all($pattern, $subject, $matches);
    }
    else
    {
        preg_match($pattern, $subject, $matches);
    }
    // preg_match() leaves $matches empty when nothing matches
    return isset($matches[0]) ? $matches[0] : false;
}
$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';
$result = pattern_match($subject, $pattern, 0);
var_dump($result);
$result = pattern_match($subject, $pattern, 1);
var_dump($result);
Here is a function without regex that should work ('C' and 'c' recognize only ASCII characters). Enjoy:
function pattern_match($subject, $pattern, $result_as_array) {
    $pattern_len = strlen($pattern);
    if ($pattern_len==0) return false; // error: empty pattern
    // translate $subject into the symbols of the rule ('n', 'c' or 'C')
    $translate = '';
    $subject_len = strlen($subject);
    for ($i=0 ; $i<$subject_len ; $i++) {
        $x = $subject[$i];
        $ord = ord($x);
        if ( ($ord>=48) && ($ord<=57) ) { // between 0 and 9
            $translate .= 'n';
        } elseif ( ($ord>=65) && ($ord<=90) ) { // between A and Z
            $translate .= 'C';
        } elseif ( ($ord>=97) && ($ord<=122) ) { // between a and z
            $translate .= 'c';
        } else {
            $translate .= $x; // other characters are not translated
        }
    }
    // now search all positions in the translated string
    // single result mode
    if (!$result_as_array) {
        $p = strpos($translate, $pattern);
        if ($p===false) {
            return false;
        } else {
            return substr($subject, $p, $pattern_len);
        }
    }
    // array result mode
    $result = array();
    $p = 0;
    while ( ($p<$subject_len) && (($p=strpos($translate,$pattern,$p))!==false) ) {
        $result[] = substr($subject, $p, $pattern_len);
        $p = $p + $pattern_len;
    }
    return $result;
}
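The translation step in the function above can also be sketched with the three-argument form of strtr(), which maps bytes one-to-one. This is only an illustration of the idea, not a replacement for the answer's code; the helper name translate_to_symbols is made up for this example, and like the original it handles ASCII only.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the answer above).
// Maps every ASCII digit to 'n', lower-case letter to 'c', and upper-case
// letter to 'C'; all other bytes are left unchanged.
function translate_to_symbols(string $subject): string
{
    $from = '0123456789'
          . 'abcdefghijklmnopqrstuvwxyz'
          . 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $to   = str_repeat('n', 10)
          . str_repeat('c', 26)
          . str_repeat('C', 26);
    // Three-argument strtr() replaces byte by byte, so the result has the
    // same length as $subject and positions line up for substr().
    return strtr($subject, $from, $to);
}

$subject    = '$2,500 + $550 on-time bonus';
$translated = translate_to_symbols($subject);
var_dump($translated);

// A plain strpos() on the translated string locates the match in the original.
$pattern = 'cc-cccc';
$pos     = strpos($translated, $pattern);
$match   = ($pos === false) ? false : substr($subject, $pos, strlen($pattern));
var_dump($match);
```

Because the translated string keeps the same length as the subject, an offset found by strpos() can be used directly with substr() on the original text.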
Update: This is an incomplete answer that doesn't hold up against several test patterns. See @Frosty Z's answer for a better solution.
<?php
function pattern_match($s, $p, $c=0) {
    $tokens = array(
        '$' => '\$',
        'n' => '\d{1}',
        'c' => '[a-z]{1}',
        'C' => '[A-Z]{1}'
    );
    $reg = '/' . str_replace(array_keys($tokens), array_values($tokens), $p) . '/';
    if ($c == 0) {
        preg_match($reg, $s, $matches);
    } else {
        preg_match_all($reg, $s, $matches);
    }
    return $matches[0];
}
$subject = "$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).";
$pattern = '$n,nnn';
print_r(pattern_match($subject, $pattern, 0));
print_r(pattern_match($subject, $pattern, 1));
$pattern = 'cc-cccc';
print_r(pattern_match($subject, $pattern));
print_r(pattern_match($subject, $pattern, 1));
?>
Output:
$2,500
Array
(
[0] => $2,500
[1] => $1,250
[2] => $1,250
)
on-time
Array
(
[0] => on-time
[1] => on-time
)
Note: Make sure to use single quotes for your $pattern when it contains $, or PHP will try to parse it as a $variable.
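One kind of pattern that likely trips up this approach is one containing regex metacharacters other than $, such as parentheses. The sketch below compares token substitution without escaping against the same substitution run after preg_quote(). Both helper names are hypothetical and exist only for this comparison; the token table is simplified from the one above.

```php
<?php
// Both helper names are assumptions made up for this comparison.

// Token substitution with only '$' escaped, in the spirit of the code above.
function build_regex_unescaped(string $pattern): string
{
    $tokens = ['$' => '\$', 'n' => '\d', 'c' => '[a-z]', 'C' => '[A-Z]'];
    return '/' . strtr($pattern, $tokens) . '/';
}

// Same substitution, but every regex metacharacter is escaped first.
function build_regex_quoted(string $pattern): string
{
    $tokens = ['n' => '\d', 'c' => '[a-z]', 'C' => '[A-Z]'];
    return '/' . strtr(preg_quote($pattern, '/'), $tokens) . '/';
}

$subject = 'paid 50% upfront ($1,250), 50% on delivery';
$pattern = '($n,nnn)';

preg_match(build_regex_unescaped($pattern), $subject, $a);
preg_match(build_regex_quoted($pattern), $subject, $b);

var_dump($a[0]); // parentheses were treated as a capture group, not as literals
var_dump($b[0]); // parentheses were matched as literal characters
```

Without escaping, the match comes back without its surrounding parentheses, because the regex engine reads them as grouping rather than as characters to match.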
The function you're looking for is preg_match_all, although you'll need to use REGEX patterns for your pattern matching.
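For comparison, here is roughly what the question's '$n,nnn' example looks like as a direct preg_match_all() call. The regex is one possible way to write it, not the only one.

```php
<?php
$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';

// \$ is a literal dollar sign, \d is one digit, \d{3} is exactly three digits.
preg_match_all('/\$\d,\d{3}/', $subject, $matches);

// $matches[0] holds every full match; its first element is the first match.
print_r($matches[0]);
$first = isset($matches[0][0]) ? $matches[0][0] : false;
var_dump($first);
```

Note the single quotes around the regex, so PHP does not try to interpolate anything after the dollar sign.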
Sorry, but this is a problem for regex. I understand your objections, but there's just no other way as efficient or simple in this case. This is an extremely simple matching problem. You could write a custom wrapper as jnpcl demonstrated, but that would only involve more code and more potential pitfalls. Not to mention extra overhead.