Regex to match part of string, when match does not contain a specific string - PCRE grep

Developer (https://www.devze.com), 2023-02-02 09:51, source: web
I'm using TextWrangler grep to perform find/replace on multiple files and have run into a wall with the last find/replace I need to perform. I need to match any text between "> and the first instance of a <br /> in a line but the match cannot contain the character sequence [xcol]. The regex flavor is Perl-Compatible (PCRE) so lookbehind needs to be fixed-length.

Example Text to Search:

<p class="x03">FooBar<br />Bar</p>
<p class="x03">FooBar [xcol]<br />Bar</p>
<p class="x06">Hello World<br />[xcol]foo[xcol]bar<br /></p>
<p class="x07">Hello World[xcol]<br />[xcol]foo[xcol]bar<br /></p>  

Desired behavior of regex:

1st Line match ">FooBar<br />

2nd Line no match

3rd Line match ">Hello World<br />

4th Line no match

The text between "> and the <br /> will be captured in a group to be used with the replace function. The closest I got was using the following regex with negative lookahead, but this will not match the 3rd line as desired:

">((?!.*?\[xcol]).*?)<br />

Any help or advice is appreciated. Thank you.


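Try this regex:

">((?:(?!\[xcol]).)*?)<br\s*/>

A (short) explanation:

">               # match '">'
(                # start group 1
  (?:            #   start a non-capturing group
    (?!\[xcol]). #     if '[xcol]' can't be seen ahead, match any character (except line breaks)
  )*?            #   end the non-capturing group; repeat it zero or more times, as few as possible
)                # end group 1, which holds all of the text before the first '<br />'
<br\s*/>         # match '<br />' (the whitespace before '/' is optional)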

If the . also needs to match line breaks, either enable DOT-ALL mode (add (?s) at the start of the pattern, before the .) or replace the . with a character class such as [\s\S].
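
The approach above can be checked against the four sample lines from the question. Below is a minimal sketch using Python's re module, which is an assumption made for illustration: TextWrangler uses PCRE, but the lookahead syntax used here should behave the same in both engines. The repeated part is wrapped in a non-capturing group so that group 1 holds the whole text before the first <br />, and the names captured_text, SAMPLE_LINES and DOTALL_PATTERN are hypothetical helpers, not part of any tool.

```python
import re

# Consume one character at a time, but only while '[xcol]' does not
# start at the current position; stop lazily at the first <br />.
PATTERN = re.compile(r'">((?:(?!\[xcol]).)*?)<br\s*/>')

# The four sample lines from the question.
SAMPLE_LINES = [
    '<p class="x03">FooBar<br />Bar</p>',
    '<p class="x03">FooBar [xcol]<br />Bar</p>',
    '<p class="x06">Hello World<br />[xcol]foo[xcol]bar<br /></p>',
    '<p class="x07">Hello World[xcol]<br />[xcol]foo[xcol]bar<br /></p>',
]


def captured_text(line, pattern=PATTERN):
    """Return group 1 of the first match in line, or None if nothing matches."""
    match = pattern.search(line)
    return match.group(1) if match else None


results = [captured_text(line) for line in SAMPLE_LINES]
print(results)  # ['FooBar', None, 'Hello World', None]

# DOT-ALL variant: with (?s), '.' also matches line breaks,
# so the captured text may span more than one line.
DOTALL_PATTERN = re.compile(r'(?s)">((?:(?!\[xcol]).)*?)<br\s*/>')
multiline = '<p class="x03">Foo\nBar<br />Baz</p>'
print(repr(captured_text(multiline)))                  # None
print(repr(captured_text(multiline, DOTALL_PATTERN)))  # 'Foo\nBar'
```

In a find/replace dialog, the captured text would then be available to the replacement as the first backreference. How the engine is configured for multi-line input may differ between tools, so the DOT-ALL part of the sketch is worth verifying in the editor itself.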
