How to match the character '<' not followed by ('a' or 'em' or 'strong')?

Developer https://www.devze.com 2022-12-27 21:15 Source: Internet
How would I make a regular expression to match the character < not followed by (a or em or strong)?

So <hello and <string would match, but <strong wouldn't.


Try this:

<(?!a|em|strong)


Use a negative lookahead. For this problem, its simplest form is:

<(?!a|em|strong)

The one issue with that is that it also refuses to match <applet>, because applet begins with a. A way to deal with that is \b, a zero-width assertion (meaning it consumes none of the input) that matches at a transition from a word character to a non-word character or the reverse. Word characters are [0-9a-zA-Z_]. So:

<(?!(a|em|strong)\b)
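To see the difference between the two patterns, here is a small check in JavaScript. The sample strings are made up for illustration; only the two regular expressions come from the answer above.

```javascript
// The plain lookahead rejects anything that merely starts with a, em, or strong.
const plain = /<(?!a|em|strong)/;
// The bounded lookahead rejects only the complete names a, em, and strong.
const bounded = /<(?!(a|em|strong)\b)/;

// Illustrative inputs (assumed, not from the original question except the first two).
const samples = ['<hello', '<string', '<applet>', '<strong>', '<a href="#">', '<em>'];

for (const s of samples) {
    console.log(s, plain.test(s), bounded.test(s));
}
// <hello        true  true
// <string       true  true
// <applet>      false true
// <strong>      false false
// <a href="#">  false false
// <em>          false false
```

Only <applet> changes: without \b it is treated like <a, with \b it matches as intended.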


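Although Andrew's answer is clearly superior, I had previously gotten it to work with [^(?:a|em|strong)]. Be aware that this is a negated character class, not a group: it matches any single character other than (, ?, :, |, ), or one of the letters a, e, m, s, t, r, o, n, g. It therefore also rejects <string, which the question wants to match.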
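A quick way to check how the character-class form behaves, assuming it is placed directly after the < (the answer does not show the full pattern, so that placement is a guess):

```javascript
// Negated character class: rejects any single listed character after "<".
const charClass = /<[^(?:a|em|strong)]/;
// Negative lookahead with a word boundary, for comparison.
const lookahead = /<(?!(?:a|em|strong)\b)/;

for (const s of ['<hello', '<string', '<table>', '<strong>']) {
    console.log(s, charClass.test(s), lookahead.test(s));
}
// <hello   true  true
// <string  false true
// <table>  false true
// <strong> false false
```

The two agree on <hello and <strong>, but the character class fails on any tag whose first letter happens to appear in the list.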


If your regex engine supports it, use a negative lookahead assertion: it looks ahead in the string and succeeds only if its pattern does not match at that position, without consuming any input. Thus, you want /<(?!(?:a|em|strong)\b)/: match a <, then succeed if what follows is not a, em, or strong followed by a word boundary, \b.
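One thing to watch for: the pattern as written still matches the < of a closing tag such as </strong>, because the next character is /. If closing tags should be excluded too (an assumption, since the question does not say), an optional \/ inside the lookahead is one way to handle it. The sketch below uses a hypothetical helper, escapeUnlisted, that replaces every matched < with &lt; just to make the matches visible.

```javascript
// Pattern from the answer: excludes <a, <em, <strong only.
const openOnly = /<(?!(?:a|em|strong)\b)/g;
// Assumed variant: also excludes </a, </em, </strong.
const openOrClose = /<(?!\/?(?:a|em|strong)\b)/g;

// Hypothetical helper: escape every "<" that the pattern matches.
function escapeUnlisted(str, pattern) {
    return str.replace(pattern, '&lt;');
}

const input = 'x < y, <b>bold</b>, <strong>kept</strong>';

console.log(escapeUnlisted(input, openOnly));
// x &lt; y, &lt;b>bold&lt;/b>, <strong>kept&lt;/strong>

console.log(escapeUnlisted(input, openOrClose));
// x &lt; y, &lt;b>bold&lt;/b>, <strong>kept</strong>
```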


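function strip_tags(str, keep){
    // Normalize keep (a tag name, an array of names, or nothing) to an array.
    var names = Array.isArray(keep) ? keep : (keep ? [keep] : []);
    // Lookahead rejects a non-name character (so "</" cannot backtrack to a bare "<")
    // or a kept name followed by \b (so keeping "em" does not also keep <embed>).
    var kept = names.length ? '|(?:' + names.join('|') + ')\\b' : '';
    return str.replace(new RegExp('</?(?![^A-Za-z0-9_-]' + kept + ')[^>]*>', 'g'), '');
}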

Usage:

strip_tags('<html><a href="a">a</a> <strong>strong text</strong> and <em>italic text</em></html>', ['strong', 'em']);
//output: a <strong>strong text</strong> and <em>italic text</em>

I would also recommend stripping the attributes from the tags you keep:

function strip_params(str){
    // Capture the whole tag name (one or more name characters), then drop everything up to ">".
    return str.replace(/<([A-Za-z0-9_\-]+)[^>]*>/g, '<$1>');
}