
Selectively encoding HTML, how?

Developer https://www.devze.com 2023-03-07 17:12 Source: web

Allow me to explain my problem with a before and after...

I have a comment system on a web community. Users can type in anything they want in a textarea, including special characters and HTML tags. In MySQL, I store the comment body exactly as typed, without any intervention. However, upon display I use HTML entities to prevent users from messing with HTML:

<?= nl2br(htmlentities($comment['body'], ENT_QUOTES, 'UTF-8')) ?>

This is working fine. However, I am now trying to enrich the comment system by automatically converting some links that are placed inside comments into richer objects. This concerns a photo forum and sometimes users make references to other photos by pasting in URLs in the comments:

http://www.jungledragon.com/image/12/eagle.html

Using regular expressions, I am replacing valid links like the one above with markup. In this case, the link would be replaced with an img tag so that, instead of a link, users see a thumbnail of that image directly inline in the comment.

The replacement is working fine. However, since I am using htmlentities, the replacement markup is displayed as text rather than as a rendered image. No surprises here.

My question is, how can I selectively HTML-encode a comment body? I want these link replacements to not be escaped, but everything else should be escaped.


Do the htmlentities first and the replacing afterwards.
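A minimal sketch of that order of operations, under some assumptions: the function name `render_comment`, the URL pattern (guessed from the single example URL in the question), and the `/thumbs/<id>.jpg` thumbnail path are all hypothetical and would need to match the real site.

```php
<?php
// Hypothetical helper: escape the whole comment, then swap known photo URLs for thumbnails.
function render_comment(string $body): string
{
    // Step 1: escape everything the user typed, exactly as the question already does.
    $escaped = htmlentities($body, ENT_QUOTES, 'UTF-8');

    // Step 2: replace only URLs that match the photo page pattern.
    // The pattern is an assumption based on the example URL; the captured parts
    // are limited to digits and [a-z0-9_-], so they are safe to place in attributes.
    $pattern = '~http://www\.jungledragon\.com/image/(\d+)/([a-z0-9_-]+)\.html~i';

    $html = preg_replace_callback($pattern, function (array $m): string {
        // Hypothetical thumbnail location; substitute the site's real one.
        $src = '/thumbs/' . $m[1] . '.jpg';
        return '<a href="' . $m[0] . '"><img src="' . $src . '" alt="' . $m[2] . '"></a>';
    }, $escaped);

    // Step 3: convert newlines last, as in the original display code.
    return nl2br($html);
}
```

Because the replacement runs on already escaped text, the only raw markup in the output is the markup the callback generates. One thing to keep in mind is that the regular expression then has to match the escaped form of a URL, so a link containing `&` would appear as `&amp;` at that point.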


Usually, you'd use a library to sanitize the HTML instead. A few are listed here:

http://htmlpurifier.org/comparison
