Using OR (|) with PHP Regex when ORing two expressions

Developer https://www.devze.com 2023-03-29 23:50 Source: Internet
I'm trying to combine two regular expressions with an OR condition in PHP so that two different string patterns can be found with one pass.

I have this pattern [\$?{[_A-Za-z0-9-]+[:[A-Za-z]*]*}] which matches strings like this ${product} and ${Product:Test}.

I have this pattern [<[A-Za-z]+:[A-Za-z]+\s*(\s[A-Za-z]+=\"[A-Za-z0-9\s]+\"){0,5}\s*/>] which matches strings like this <test:helloWorld /> and <calc:sum val1="10" val2="5" />.

However when I try to join the two patterns into one

[\$?{[_A-Za-z0-9-]+[:[A-Za-z]*]*}]|[<[A-Za-z]+:[A-Za-z]+\s*(\s[A-Za-z]+=\"[A-Za-z0-9\s]+\"){0,5}\s*/>]

so I can find all the matching strings with one call to

preg_match_all(REGEX_COMBINED, $markup, $results, PREG_SET_ORDER);

I get the following error message: Unknown modifier '|'.

Can anyone please tell me where I am going wrong? I've tried multiple variations of the pattern, but nothing I do seems to work.

Thanks


In PHP, regexes have to be enclosed in delimiters, like /abc/ or ~abc~. Almost any ASCII punctuation character will do; it just has to be the same character at both ends in most cases. The exception is when you use "bracketing" characters like () and <>; then they have to be correctly paired.

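With your original regexes, the outermost square brackets were being used as regex delimiters. After you glued them together it no longer worked, because the compiler still treated the ] that ends the first pattern as the closing delimiter. Everything after that point is read as modifiers, and | is not a valid modifier, hence the error message.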
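The delimiter behaviour described above can be checked with a small sketch. The patterns and subject strings here are made up for illustration and are not from the question; the @ operator is only there to keep the warning off the screen so that it can be read back with error_get_last().

```php
<?php
// With bracket-style delimiters, '[abc]' is the regex abc, not a character class.
$asDelimiters = preg_match('[abc]', 'xxabcxx'); // the literal text "abc" is found
$singleChar   = preg_match('[abc]', 'a');       // no match: all of "abc" is required

// With ~ as the delimiter, the square brackets are part of the pattern again.
$asClass = preg_match('~[abc]~', 'a');

// Two bracket-delimited patterns glued with |: the pattern ends at the first
// closing ], so '|[def]' is parsed as modifiers and compilation fails.
$glued = @preg_match('[abc]|[def]', 'abc');
$error = error_get_last();

var_dump($asDelimiters, $singleChar, $asClass, $glued);
echo $error['message'], PHP_EOL;
```

The last call returns false rather than 0, which is how preg_match reports a pattern that could not be compiled.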

Another problem is that you're trying to use square brackets for grouping, which is wrong; you use parentheses for that. If you look below you'll see that I replaced square brackets with parentheses where needed, but the outermost pair I simply dropped; grouping isn't needed at that level. Then I added ~ to serve as the regex delimiter. I also added the i modifier and got rid of some clutter.

~\$?\{[\w-]+(?::[a-z]*)*\}~i

~<[a-z]+:[a-z]+\s*(?:\s[a-z]+=\"[a-z\d\s]+\"){0,5}\s*/>~i
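As a rough illustration of why square brackets cannot stand in for parentheses: inside a pattern, square brackets form a character class (any one of the listed characters), while (?:...) groups a sequence. The toy patterns below are assumptions chosen for the demonstration, not part of the original regexes.

```php
<?php
// Character class: after "a", any mix of the characters b and c is accepted.
$classLoose = preg_match('~^a[bc]+$~', 'acbb');

// Non-capturing group: after "a", only whole repetitions of "bc" are accepted.
$groupStrict = preg_match('~^a(?:bc)+$~', 'acbb');
$groupOk     = preg_match('~^a(?:bc)+$~', 'abcbc');

var_dump($classLoose, $groupStrict, $groupOk);
```

The same distinction is what separates [:[A-Za-z]*]* in the question from (?::[a-z]*)* in the rewritten pattern.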

To combine the regexes, just remove the ending ~i from the first regex and the opening ~ from the second, and replace them with a pipe:

~\$?\{[\w-]+(?::[a-z]*)*\}|<[a-z]+:[a-z]+\s*(?:\s[a-z]+=\"[a-z\d\s]+\"){0,5}\s*/>~i
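A minimal sketch of using the combined pattern with preg_match_all follows. The $markup string is a hypothetical sample, and $pattern stands in for the REGEX_COMBINED constant from the question. Note that the sample tag uses letter-only attribute names (first, second): the [a-z]+ part of the pattern does not admit digits, so an attribute named val1 would not match as the pattern is written.

```php
<?php
// Combined pattern from the answer, kept in a single-quoted string so that
// PHP leaves the backslashes and the $ untouched before PCRE sees them.
$pattern = '~\$?\{[\w-]+(?::[a-z]*)*\}|<[a-z]+:[a-z]+\s*(?:\s[a-z]+=\"[a-z\d\s]+\"){0,5}\s*/>~i';

// Hypothetical markup containing both kinds of placeholder.
$markup = 'Buy ${product} or ${Product:Test}: <test:helloWorld /> <calc:sum first="10" second="5" />';

// PREG_SET_ORDER yields one array per match; index 0 holds the full match.
$count = preg_match_all($pattern, $markup, $results, PREG_SET_ORDER);
$found = array_column($results, 0);

print_r($found);
```

Because the groups are non-capturing, each entry in $results holds only the full match at index 0.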


Try wrapping the two conditions in an outer set of brackets "(...|...)":

([\$?{[_A-Za-z0-9-]+[:[A-Za-z]*]*}]|[<[A-Za-z]+:[A-Za-z]+\s*(\s[A-Za-z]+=\"[A-Za-z0-9\s]+\"){0,5}\s*/>])

Tested here and it seemed to work
