
Regex to determine text 'http://...' but not in iframes, embeds, etc.

Developer (https://www.devze.com), 2023-04-06 17:17. Source: the web

This regex is used to replace text links with a clickable anchor tag.

#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i

My problem is that I don't want it to change links that are inside things like <iframe src="http://... or <embed src="http://...

I tried checking for a whitespace character before it by adding \s, but that didn't work.

Or, since it appears the regex first checks that an href=" doesn't already precede the URL (?), maybe I can check for the other things too?

Any thoughts or explanations of how I would do this are greatly appreciated. Mainly, I just need the regex; I can implement it in CakePHP myself.

The actual code comes from CakePHP's Text->autoLink():

function autoLinkUrls($text, $htmlOptions = array()) {
    $options = var_export($htmlOptions, true);
    $text = preg_replace_callback('#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i', create_function('$matches',
        '$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], $matches[0],' . $options . ');'), $text);
    return preg_replace_callback('#(?<!href="|">)(?<!http://|https://|ftp://|nntp://)(www\.[^\n\%\ <]+[^<\n\%\,\.\ <])(?<!\))#i',
        create_function('$matches', '$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], "http://" . $matches[0],' . $options . ');'), $text);
}


You can expand the lookbehind at the beginning of those regexes to check for src=" as well as href=", like this:

(?<!href="|src="|">)
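To see the expanded lookbehind in action outside CakePHP, here is a minimal standalone sketch. The function name linkBareUrls is hypothetical and not part of CakePHP, and the anchor is built by hand rather than through HtmlHelper; only the pattern itself comes from the question and the suggestion above. PCRE accepts alternatives of different lengths at the top level of a lookbehind, which is why href=", src=" and "> can share one group.

```php
<?php
// Hypothetical helper (not CakePHP's API): wraps bare URLs in anchor tags,
// but leaves URLs alone when they directly follow href=", src=" or ">.
function linkBareUrls(string $text): string
{
    // Same URL pattern as in the question, with src=" added to the lookbehind.
    $pattern = '#(?<!href="|src="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i';

    return preg_replace_callback(
        $pattern,
        function (array $matches): string {
            // Assumption: the input is already HTML, so the URL is not re-escaped here.
            return '<a href="' . $matches[0] . '">' . $matches[0] . '</a>';
        },
        $text
    );
}

$input = 'Visit http://example.com/page and <iframe src="http://example.com/embed"></iframe>';
echo linkBareUrls($input), "\n";
```

With this input, the plain-text URL should be wrapped in an anchor while the iframe's src attribute stays untouched. The same lookbehind change would apply to the second (www.) regex in autoLinkUrls if src values without a scheme are a concern.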
