I'm just trying to remove some elements with preg_replace but can't get it to work consistently. I would like to remove an element with a matching class. The problem is that the element may have an ID or several classes.
i.e. the element could be
<div id="me1" class="removeMe">remove me and my parent</div>
or
<div id="me1" class="removeMe" style="display:none">remove me and my parent</div>
Is it possible to do this?
Any help appreciated! Dan.
I agree with MarcB. Overall, it's better to use a DOM when manipulating HTML. But here is a regex based on smottt's answer that might work:
$html = preg_replace('~<div([^>]*)(class\\s*=\\s*["\']removeMe["\'])([^>]*)>(.*?)</div>~i', '', $html);
- Use [^>]* and [^<]* instead of .* . In my testing, .*? doesn't work. If a non-matching div comes before a matching div, it will match the first div, everything in between, and the last div. For example, it incorrectly matches against this entire string: <div></div><b>hello</b><div class="removeMe">bar</div>
- Take into account the fact that you can use single quotes with HTML attributes.
- Also remember that there can be whitespace around the equals sign.
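The first point can be checked with a small sketch. The lazy-dot pattern below is my own reconstruction of the kind of regex that fails (it is not quoted from any answer here), and the variable names are only illustrative:

```php
<?php
// Test string from the bullet above: a non-matching div before a matching one.
$html = '<div></div><b>hello</b><div class="removeMe">bar</div>';

// Lazy dot inside the tag: .*? can still run past ">" into later tags,
// so the match starts at the first <div and swallows everything.
$lazyDot = '~<div.*?class="removeMe".*?>.*?</div>~i';

// Negated class: [^>]* cannot leave the opening tag it started in.
$negated = '~<div[^>]*class\s*=\s*["\']removeMe["\'][^>]*>.*?</div>~i';

$afterLazy    = preg_replace($lazyDot, '', $html);
$afterNegated = preg_replace($negated, '', $html);

var_dump($afterLazy);    // string(0) ""
var_dump($afterNegated); // string(23) "<div></div><b>hello</b>"
```

Note that the second pattern still uses .*? for the element's content, so it stops at the first closing </div> and will not cope with nested divs.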
You should use the "m" modifier too so that it takes line breaks into account (see this page).
I added parentheses for clarity, but they aren't needed. Let me know if this works or not.
EDIT: Actually, never mind, the "m" modifier won't do anything here, since it only changes where ^ and $ match.
EDIT2: Improved the regex, but it still fails if there are any newlines in the div, because . does not match a newline unless the "s" modifier is set.
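For the DOM route mentioned at the top of this answer, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The function name removeByClass and the wrapper div are my own assumptions for illustration, and the class name is assumed to be a plain word with no quotes in it. This removes only the matching element; to drop its parent as well, you could call removeChild on $node->parentNode's own parent instead.

```php
<?php
// Hypothetical helper (not from the original post): removes every element
// whose class attribute contains $class as a whole word.
function removeByClass(string $html, string $class): string
{
    libxml_use_internal_errors(true); // keep warnings about sloppy markup quiet
    $doc = new DOMDocument();
    // Wrap the fragment in a single root so libxml does not rearrange it.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Pad with spaces so "removeMe" does not match "removeMeToo".
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    // Copy the node list first; removing nodes while iterating it is unsafe.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize the children of the wrapper, leaving the wrapper itself out.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = "<div></div><b>hello</b>\n"
      . "<div id=\"me1\" class=\"foo removeMe\" style=\"display:none\">remove\nme</div>";
echo removeByClass($html, 'removeMe');
```

Unlike the regex, this does not care about attribute order, quote style, extra classes, or newlines inside the element.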
While this is still doable with regular expressions, it's much simpler with, e.g., QueryPath:
print qp($html)->find(".removeMe")->parent()->remove()->writeHTML();
With preg_replace:
preg_replace('~<div([^>]*)class="(.*?)gallery(.*?)">(.*?)</div>~im', '', $html);