
Regex conditional

Posted on 开发者 (https://www.devze.com), 2023-01-11 18:28. Source: the web.

How would I write a RegEx to:

Find a match where the first instance of a > character is before the first instance of a < character.

(I am looking for bad HTML where the first closing > on a line has no opening < before it.)


It's a pretty bad idea to try to parse HTML with a regex, or even to try to detect broken HTML with one.

For example, what happens when a tag is split by a line break so that the > character is the first character on the next line? That is valid HTML, but a line-by-line check would flag it.

You might get some mileage from reading the answers to this question also: RegEx match open tags except XHTML self-contained tags
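To make the line-break objection concrete, here is a minimal sketch in Python (chosen only for illustration; the thread does not name a language). It assumes a line-by-line check of the form `^[^<]*>`, and the sample markup is hypothetical.

```python
import re

# Assumed check: a '>' appears on the line before any '<'.
STRAY_GT = re.compile(r"^[^<]*>")

# Valid HTML whose opening tag is split across two lines (hypothetical sample).
html = '<a href="https://example.com"\n>link</a>'

# Test each line on its own, as a line-oriented tool would.
flagged = [line for line in html.splitlines() if STRAY_GT.search(line)]
print(flagged)  # ['>link</a>']
```

The second line is reported even though the document is well-formed, which is the kind of false alarm the answer warns about.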


Would this work?

string =~ /^[^<]*>/

This anchors at the beginning of the line, consumes every character that isn't an opening '<', and matches only if it then finds a closing '>'.
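A quick way to try that pattern is sketched below in Python. The helper name `has_stray_gt` and the sample lines are made up for illustration and are not part of the answer.

```python
import re

# Same pattern as above: from the line start, skip non-'<' characters, then require '>'.
PATTERN = re.compile(r"^[^<]*>")

def has_stray_gt(line: str) -> bool:
    """Return True if a '>' occurs on the line before any '<'."""
    return PATTERN.match(line) is not None

print(has_stray_gt('width="100">text</td>'))      # True: '>' comes first
print(has_stray_gt('<td width="100">text</td>'))  # False: the line opens with '<'
print(has_stray_gt('1 < 2 > 0'))                  # False: '<' precedes '>'
print(has_stray_gt('plain text, no brackets'))    # False: no '>' at all
```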


^[^<>]*>

If you need the corresponding < as well,

^[^<>]*>[^<]*<

If there is a possibility of tags before the first >,

^[^<>]*(?:<[^<>]+>[^<>]*)*>
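The three patterns above behave differently on the same input. The following Python sketch compares them; the constant names, the `classify` helper, and the sample lines are hypothetical and only serve the comparison.

```python
import re

GT_FIRST = re.compile(r"^[^<>]*>")                          # '>' before any bracket
GT_THEN_LT = re.compile(r"^[^<>]*>[^<]*<")                  # ...followed later by a '<'
GT_AFTER_TAGS = re.compile(r"^[^<>]*(?:<[^<>]+>[^<>]*)*>")  # complete tags may come first

def classify(line: str) -> tuple[bool, bool, bool]:
    """Report which of the three patterns match the line, in the order above."""
    return (
        GT_FIRST.match(line) is not None,
        GT_THEN_LT.match(line) is not None,
        GT_AFTER_TAGS.match(line) is not None,
    )

print(classify('class="x">text</p>'))     # (True, True, True)
print(classify('dangling>'))              # (True, False, True)
print(classify('<b>ok</b> stray> here'))  # (False, False, True)
print(classify('<b>ok</b>'))              # (False, False, False)
```

Only the third pattern catches a stray > that appears after well-formed tags on the same line.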

Note that this last pattern can give false positives, e.g.

<!-- > -->

is valid HTML, but the regex will flag it.
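The comment case can be reproduced with the pattern that allows tags before the first >. The sketch below is in Python. The comment-stripping step is an assumption added here as one possible workaround, not something the answer proposes, and it only handles comments that open and close on the same line.

```python
import re

GT_AFTER_TAGS = re.compile(r"^[^<>]*(?:<[^<>]+>[^<>]*)*>")

# Hypothetical pre-processing step: drop single-line HTML comments first.
COMMENT = re.compile(r"<!--.*?-->")

line = "<!-- > -->"

# The '>' inside the comment is read as closing a tag, so the final '>' looks stray.
raw_hit = GT_AFTER_TAGS.match(line) is not None
print(raw_hit)  # True: a false positive on valid HTML

# With the comment removed, nothing is left to flag.
cleaned = COMMENT.sub("", line)
cleaned_hit = GT_AFTER_TAGS.match(cleaned) is not None
print(cleaned_hit)  # False
```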
