I'm looking for a way to find/replace links to images (within user-generated content) without touching links to non-images.
For example, the following text:
<a href="http://domain.com/arbitrary-file.jpg">Text</a>
<a href="http://domain.com/arbitrary-file.jpeg">Text</a>
<a href="http://domain.com/arbitrary-path/arbitrary-file.gif">Text</a>
<a href="http://domain.com/arbitrary-file.png">Text</a>
<a href="http://domain.com/arbitrary-file.html">Text</a>
<a href="http://domain.com/arbitrary-path/">Text</a>
<a href="http://domain.com/arbitrary-file#anchor_to_here">Text</a>
Non-hyperlinked URL: http://domain.com/arbitrary-path/arbitrary-file.gif
Non-hyperlinked URL: http://domain.com/arbitrary-file#anchor_to_here
... should be revised to:
<img src="http://domain.com/arbitrary-file.jpg" alt="Text" />
<img src="http://domain.com/arbitrary-file.jpeg" alt="Text" />
<img src="http://domain.com/arbitrary-path/arbitrary-file.gif" alt="Text" />
<img src="http://domain.com/arbitrary-file.png" alt="Text" />
<a href="http://domain.com/arbitrary-file.html">Text</a>
<a href="http://domain.com/arbitrary-path/">Text</a>
<a href="http://domain.com/arbitrary-file#anchor_to_here">Text</a>
Non-hyperlinked URL: http://domain.com/arbitrary-path/arbitrary-file.gif
Non-hyperlinked URL: http://domain.com/arbitrary-file#anchor_to_here
... securely and reliably in PHP.
You might want to look at using an HTML parser (rather than regular expressions, as you tagged the submission), such as the PHP Simple HTML DOM Parser. This would provide you with the reliability you speak of.
You'll probably end up with something like this:
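// $html is assumed to come from str_get_html() in PHP Simple HTML DOM Parser.
foreach ($html->find('a') as $element)
{
    // Convert only links whose href ends in an image extension; leave the rest untouched.
    if (preg_match('~\.(gif|jpe?g|png)$~i', $element->href))
        $element->outertext = '<img src="'.$element->href.'" alt="'.$element->plaintext.'" />';
}
echo $html;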
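If adding a third-party library is not an option, roughly the same idea can be sketched with PHP's built-in DOMDocument. This is only a sketch under a few assumptions: the function name linksToImages() is made up for illustration, the content is UTF-8, only http/https links count as safe image sources, and the extension is checked on the URL path so that query strings and fragments are ignored.

```php
<?php
// Sketch only: linksToImages() is a hypothetical helper, not part of any library.
function linksToImages(string $content): string
{
    $doc = new DOMDocument();
    // User-generated content is rarely valid HTML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a div so it can be pulled back out; the XML prefix hints UTF-8 to libxml.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $content . '</div>');
    libxml_clear_errors();

    // Copy the live node list first: replacing nodes while iterating it would skip items.
    $links = iterator_to_array($doc->getElementsByTagName('a'));
    foreach ($links as $a) {
        $href = $a->getAttribute('href');
        $path = (string) parse_url($href, PHP_URL_PATH);
        // Assumption: only absolute http(s) URLs whose path ends in an image extension qualify.
        if (!preg_match('~^https?://~i', $href) || !preg_match('~\.(gif|jpe?g|png)$~i', $path)) {
            continue;
        }
        $img = $doc->createElement('img');
        $img->setAttribute('src', $href);                 // setAttribute() escapes the value
        $img->setAttribute('alt', trim($a->textContent)); // link text without nested tags
        $a->parentNode->replaceChild($img, $a);
    }

    // Serialize only the children of the wrapper div (the first div in the document).
    $wrap = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling linksToImages($content) on the sample text from the question should leave the .html, directory and anchor links alone, as well as the non-hyperlinked URLs, since those are plain text nodes rather than a elements. Note that saveHTML() writes img tags in HTML style, without the trailing slash.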
There's no fully reliable way to do this, at least not with regular expressions, but this should do the trick nevertheless:
$str = preg_replace('~<a\s[^>]*?href="([^"]*\.(?:gif|jpe?g|png))"[^>]*>.*?</a>~is', '<img src="$1" />', $str);
To open this up a bit:
- Find opening <a tags
- Find the href attribute inside that tag
- Get the href if it ends with one of the listed file extensions and a " character
- Include the rest of the link until the closing </a> tag in the replace
- Replace the whole match with an img element that gets the href as a src attribute
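The steps above can also be sketched with preg_replace_callback(), which makes it possible to carry the link text over into the alt attribute. This is a sketch under assumptions: the function name replaceImageLinks() is invented for illustration, and href values are assumed to be double-quoted. The pattern here stops the href at its closing quote and requires a dot before the extension, so a match cannot run from a non-image link into a later image link.

```php
<?php
// Sketch only: replaceImageLinks() is a hypothetical name; assumes double-quoted href attributes.
function replaceImageLinks(string $str): string
{
    // Group 1: href ending in an image extension. Group 2: everything between <a ...> and </a>.
    $pattern = '~<a\s[^>]*?href="([^"]*\.(?:gif|jpe?g|png))"[^>]*>(.*?)</a>~is';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Drop nested tags from the link text and escape it for use inside an attribute.
        $alt = htmlspecialchars(strip_tags($m[2]), ENT_QUOTES, 'UTF-8', false);
        return '<img src="' . $m[1] . '" alt="' . $alt . '" />';
    }, $str);

    // preg_replace_callback() returns null on a PCRE error; fall back to the input unchanged.
    return $result ?? $str;
}
```

Because the href group cannot contain a double quote, it can be placed back into a double-quoted src attribute as is. Links that use single quotes or no quotes at all are simply left untouched by this pattern.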
As Bauer noted, you could be better off using DOM methods. But if you can be sure your links are always in this format, you can use regular expressions. Regex might be a bit faster also.