This is a simple one :)
I have this line which works great:
$listing['biz_description'] = preg_replace('/<!--.*?--\>/','',$listing['biz_description']);
What is the proper regex to remove the HTML entity version?
These are the entities:
&lt;!-- --&gt;
If you are happy with the preg_replace regex you already have, I would just decode the HTML entities first with html_entity_decode. As @ircmaxell mentioned, using regex for HTML parsing can be very painful.
$str = "This is a &lt;!-- test --&gt; of the emergency &lt;!-- broadcast --&gt; system";
$str = preg_replace('/<!--.*?--\>/', '', html_entity_decode($str));
echo $str;
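If decoding the whole string is not desirable (html_entity_decode also turns every other entity back into live markup), a rough sketch that matches the entity-encoded delimiters directly might look like the following. The helper name stripEncodedComments and the sample string are made up for illustration, and it assumes the delimiters are encoded exactly as &lt; and &gt;.

```php
<?php
// Hypothetical helper: remove comments whose delimiters are stored as entities.
function stripEncodedComments(string $text): string
{
    // The "s" modifier lets "." span newlines, so multi-line comments match too.
    return preg_replace('/&lt;!--.*?--&gt;/s', '', $text);
}

// Sample input (an assumption, not taken from the original listing data).
$str = 'This is a &lt;!-- test --&gt; of the emergency &lt;!-- broadcast --&gt; system &amp; more';
echo stripEncodedComments($str), PHP_EOL;
```

Other entities in the string, such as the &amp; in the sample, are left untouched because nothing is decoded.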
NEVER use regex to parse HTML/XML...
An implementation with DOMDocument (assuming the input is valid XML):
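$dom = new DOMDocument();
$dom->loadXML($listing['biz_description']);
removeComments($dom);
$listing['biz_description'] = $dom->saveXML();

// Recursively remove every comment node at or below $node.
function removeComments(DOMNode $node) {
    if ($node instanceof DOMComment) {
        $node->parentNode->removeChild($node);
    } elseif ($node->hasChildNodes()) {
        // childNodes is a live list: copy it first, otherwise removing
        // a node while iterating would skip the following sibling.
        foreach (iterator_to_array($node->childNodes) as $child) {
            removeComments($child);
        }
    }
}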
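As an alternative to walking the tree by hand, the same idea can be sketched with DOMXPath, which can select all comment nodes in one query. This is only a sketch: the function name stripXmlComments and the sample markup are made up for illustration, the input is assumed to be well-formed XML with a single root element, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: strip all comment nodes from an XML string using XPath.
function stripXmlComments(string $xml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $xpath = new DOMXPath($dom);
    // The query result is not a live list, so removing nodes inside the loop is safe.
    foreach ($xpath->query('//comment()') as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    // Serialize only the root element, which leaves out the XML declaration.
    return $dom->saveXML($dom->documentElement);
}

echo stripXmlComments('<p>Hello <!-- hidden -->world<!-- again --></p>'), PHP_EOL;
```

Passing the root element to saveXML is a design choice here, so the result can be stored back without an XML declaration in front of it.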