A pattern-matching function in PHP

Developer https://www.devze.com 2023-02-20 22:25 Source: web
I'm looking for a function, class or collection of functions that will assist in the process of pattern matching strings, as I have a project that requires a fair amount of pattern matching and I'd like something easier to read and maintain than raw preg_replace (or regex, period).

I've provided a pseudo example in hopes that it will help you understand what I'm asking.

$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';
pattern_match($subject, $pattern, 0);

would return "$2,500".

$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';
pattern_match($subject, $pattern, 1);

would return an array with the values: [$2,500], [$1,250], [$1,250]

The function, as I'm trying to write it, uses 'n' for digits, 'c' for lower-case letters and 'C' for upper-case letters; any non-alphanumeric character represents itself.

Any help would be appreciated.


<?php

// $match_all = false: returns string with first match
// $match_all = true:  returns array of strings with all matches

function pattern_match($subject, $pattern, $match_all = false)
{
  $pattern = preg_quote($pattern, '|');

  $ar_pattern_replaces = array(
      'n' => '[0-9]',
      'c' => '[a-z]',
      'C' => '[A-Z]',
    );

  $pattern = strtr($pattern, $ar_pattern_replaces);

  $pattern = "|".$pattern."|";

  if ($match_all)
  {
    preg_match_all($pattern, $subject, $matches);
  }
  else
  {
    preg_match($pattern, $subject, $matches);
  }

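  // preg_match() leaves $matches empty when nothing is found, so return false in that case
  return isset($matches[0]) ? $matches[0] : false;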
}

$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '$n,nnn';

$result = pattern_match($subject, $pattern, 0);
var_dump($result);

$result = pattern_match($subject, $pattern, 1);
var_dump($result);


Here is a function that uses no regexp and should work ('C' and 'c' recognize only ASCII characters). Enjoy:

function pattern_match($subject, $pattern, $result_as_array) {

    $pattern_len = strlen($pattern);
    if ($pattern_len==0) return false; // error: empty pattern

    // translate $subject into the symbols of the rule ('n', 'c' or 'C')
    $translate = '';
    $subject_len = strlen($subject);
    for ($i=0 ; $i<$subject_len ; $i++) {
        $x = $subject[$i];
        $ord = ord($x);
        if ( ($ord>=48) && ($ord<=57) ) { // between 0 and 9
            $translate .= 'n';
        } elseif ( ($ord>=65) && ($ord<=90) ) { // between A and Z
            $translate .= 'C';
        } elseif ( ($ord>=97) && ($ord<=122) ) { // between a and z
            $translate .= 'c';
        } else {
            $translate .= $x; // other characters are not translated
        }
    }

    // now search all positions in the translated string

    // single result mode
    if (!$result_as_array) {
        $p = strpos($translate, $pattern);
        if ($p===false) {
            return false;
        } else {
            return substr($subject, $p, $pattern_len);
        }
    }

    // array result mode
    $result = array();
    $p = 0;
    while ( ($p<$subject_len)  && (($p=strpos($translate,$pattern,$p))!==false) ) {
        $result[] = substr($subject, $p, $pattern_len);
        $p = $p + $pattern_len;
    }
    return $result;

}
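
To make the translation step above easier to follow, here is a minimal sketch of just that idea. It assumes the three-argument form of strtr as a stand-in for the character loop, and the variable names ($from, $to, $translated) are illustrative, not part of the answer's function.

```php
<?php
// Map every ASCII digit to 'n', lower-case letter to 'c', upper-case letter to 'C'.
// Assumption: strtr($string, $from, $to) replaces the loop from the answer above.
$from = '0123456789' . 'abcdefghijklmnopqrstuvwxyz' . 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$to   = str_repeat('n', 10) . str_repeat('c', 26) . str_repeat('C', 26);

$subject    = '$2,500 + $550 on-time bonus';
$translated = strtr($subject, $from, $to);   // same length as $subject

// The pattern is searched in the translated string; the match is then cut
// out of the original subject at the same offset.
$pattern = 'cc-cccc';
$pos     = strpos($translated, $pattern);
$match   = ($pos === false) ? false : substr($subject, $pos, strlen($pattern));

var_dump($translated, $match);
```

Because each pattern symbol stands for exactly one character, the translated string and the subject stay aligned, which is why a plain substr with the pattern length recovers the match.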


Update: This is an incomplete answer that doesn't hold up against several test patterns. See @Frosty Z's answer for a better solution.

<?php
    function pattern_match($s, $p, $c=0) {
        $tokens = array(
            '$' => '\$',
            'n' => '\d{1}',
            'c' => '[a-z]{1}',
            'C' => '[A-Z]{1}'
        );
        $reg = '/' . str_replace(array_keys($tokens), array_values($tokens), $p) . '/';
        if ($c == 0) {
            preg_match($reg, $s, $matches);
        } else {
            preg_match_all($reg, $s, $matches);
        }
        // preg_match() leaves $matches empty when nothing is found
        return isset($matches[0]) ? $matches[0] : false;
    }

    $subject = "$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).";

    $pattern = '$n,nnn';
    print_r(pattern_match($subject, $pattern, 0));
    print_r(pattern_match($subject, $pattern, 1));

    $pattern = 'cc-cccc';
    print_r(pattern_match($subject, $pattern));
    print_r(pattern_match($subject, $pattern, 1));
?>

Output:

$2,500

Array
(
    [0] => $2,500
    [1] => $1,250
    [2] => $1,250
)

on-time

Array
(
    [0] => on-time
    [1] => on-time
)
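
One pattern that seems to trip the token table above is anything containing a regex metacharacter other than '$', since only '$' is escaped. The sketch below compares it with a preg_quote-based variant; the function names token_table_match and quoted_match are hypothetical, chosen only for this illustration.

```php
<?php
// Token-table approach as in the answer above: only '$' is escaped by hand.
function token_table_match($subject, $pattern) {
    $tokens = array(
        '$' => '\$',
        'n' => '\d{1}',
        'c' => '[a-z]{1}',
        'C' => '[A-Z]{1}'
    );
    $regex = '/' . str_replace(array_keys($tokens), array_values($tokens), $pattern) . '/';
    return preg_match($regex, $subject, $m) ? $m[0] : false;
}

// Variant that lets preg_quote() escape every metacharacter first.
function quoted_match($subject, $pattern) {
    $quoted = preg_quote($pattern, '/');
    $regex  = '/' . strtr($quoted, array('n' => '[0-9]', 'c' => '[a-z]', 'C' => '[A-Z]')) . '/';
    return preg_match($regex, $subject, $m) ? $m[0] : false;
}

$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';
$pattern = '($n,nnn)';

// The unescaped parentheses become a capture group, so the first plain amount matches.
var_dump(token_table_match($subject, $pattern));
// With the parentheses escaped, only the parenthesised amount matches.
var_dump(quoted_match($subject, $pattern));
```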

Note: Make sure to use single-quotes for your $pattern when it contains $, or PHP will try to parse it as a $variable.
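
A small sketch of the quoting difference described in the note. The variable $n is defined here only to make the interpolation visible; it is not part of the code above.

```php
<?php
$n = 'X'; // hypothetical variable, present only to show what interpolation does

$single = '$n,nnn'; // single quotes: the text is kept exactly as written
$double = "$n,nnn"; // double quotes: PHP substitutes the value of $n

// A '$' followed by a digit is not a valid variable name, so it is left alone
// even in double quotes. That is why the double-quoted $subject above works.
$amount = "$2,500";

var_dump($single, $double, $amount);
```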


The function you're looking for is preg_match_all, although you'll need to use regex patterns for your pattern matching.
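
For comparison, a minimal sketch of what the raw preg_match_all call might look like for the '$n,nnn' example from the question. The regex is one reasonable translation of that pattern, not the only one.

```php
<?php
$subject = '$2,500 + $550 on-time bonus, paid 50% upfront ($1,250), 50% on delivery ($1,250 + on-time bonus).';

// \$     a literal dollar sign
// \d     one digit
// ,      a literal comma
// \d{3}  exactly three digits
preg_match_all('/\$\d,\d{3}/', $subject, $matches);

$amounts = $matches[0]; // index 0 holds the full-pattern matches
print_r($amounts);
```

Here '$550' is skipped because it has no comma, which matches the behaviour asked for in the question.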


Sorry, but this is a problem for regex. I understand your objections, but there's just no other way as efficient or simple in this case. This is an extremely simple matching problem. You could write a custom wrapper as jnpcl demonstrated, but that would only involve more code and more potential pitfalls. Not to mention extra overhead.
