HTML code strip regexp problem

开发者 https://www.devze.com 2023-01-14 06:56 Source: Internet

In JavaScript, a popular use of regular expressions is stripping HTML tags from text. The code for that is:

String.prototype.stripHTML = function () {
    var reTag = /<(?:.|\s)*?>/g;
    return this.replace(reTag, "");
};

If you try this on "<b>This would be bold</b>".stripHTML(), it outputs "This would be bold". Shouldn't it output ""?

Doesn't this regex say to match everything that starts with < and ends with >? Why didn't the match start at the < of <b> and end at the > of </b>?

You are using a non-greedy modifier.

(?:.|\s)*?
         ^

This makes the match as short as possible, instead of the default behaviour, which is to match as much as possible.

<b>This would be bold</b>
^-^                  ^--^     Non-greedy: <(?:.|\s)*?>
^-----------------------^     Greedy    : <(?:.|\s)*>
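
The diagram above can be checked in a console. This is a minimal sketch using the string from the question; the helper name findTags is hypothetical and only wraps String.prototype.match.

```javascript
// Sample input from the question.
const html = "<b>This would be bold</b>";

// Hypothetical helper: return every match of a global regex as an array.
function findTags(text, re) {
  return text.match(re) || [];
}

const nonGreedy = findTags(html, /<(?:.|\s)*?>/g); // stops at the first ">"
const greedy = findTags(html, /<(?:.|\s)*>/g);     // runs to the last ">"

console.log(nonGreedy); // [ '<b>', '</b>' ]
console.log(greedy);    // [ '<b>This would be bold</b>' ]
```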

Yes, but *? performs a non-greedy match (the shortest match possible):

var reTag = /<(?:.|\s)*?>/g;

To perform a greedy match (the longest match possible), remove the ? that follows the *:

var reTag = /<(?:.|\s)*>/g;
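
As a rough check of what each variant does when used for stripping, the sketch below applies both to the question's string and to a second, made-up input with text between two elements. The function names stripNonGreedy and stripGreedy are hypothetical, chosen only for this illustration.

```javascript
// Non-greedy: each tag is matched and removed on its own.
function stripNonGreedy(text) {
  return text.replace(/<(?:.|\s)*?>/g, "");
}

// Greedy: one match spans from the first "<" to the last ">".
function stripGreedy(text) {
  return text.replace(/<(?:.|\s)*>/g, "");
}

console.log(stripNonGreedy("<b>This would be bold</b>")); // "This would be bold"
console.log(stripGreedy("<b>This would be bold</b>"));    // ""

// Made-up input with plain text between two elements.
const sample = "one <b>two</b> three <i>four</i> five";
console.log(stripNonGreedy(sample)); // "one two three four five"
console.log(stripGreedy(sample));    // "one  five"
```

The greedy form produces the empty string the question expected, but on the second input it also removes the text between the first and last tag, which is probably why tag strippers usually keep the non-greedy form.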

It's a non-greedy regex, meaning that it stops at the first > it comes across, so <b> and </b> are separate matches.
