PHP - Strings - Remove an HTML tag with a specific class, including its contents

Developer https://www.devze.com 2023-01-05 10:12 Source: Internet
I have a string like this:

<div class="container">
  <h3 class="hdr"> Text </h3>
  <div class="main">
    text
    <h3> text... </h3>
    ....

  </div>
</div>

How do I remove the h3 tag with the .hdr class using as little code as possible?


Using as little code as possible? Shortest code isn't necessarily best. However, if your HTML h3 tag always looks like that, this should suffice:

$html = preg_replace('#<h3 class="hdr">(.*?)</h3>#', '', $html);

Generally speaking, using regex for parsing HTML isn't a particularly good idea though.
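
As a quick check, here is a minimal sketch that runs that pattern against the string from the question. The names $clean and $variant are only illustrative, and the second input is a made-up variation meant to show how brittle an exact match is.

```php
<?php
// Sample input copied from the question.
$html = <<<HTML
<div class="container">
  <h3 class="hdr"> Text </h3>
  <div class="main">
    text
    <h3> text... </h3>
  </div>
</div>
HTML;

// Exact-match pattern from the answer: it only fires when the tag is written
// precisely as <h3 class="hdr"> and the whole element sits on one line.
$clean = preg_replace('#<h3 class="hdr">(.*?)</h3>#', '', $html);
echo $clean, "\n";

// Hypothetical variation: different quoting and an extra attribute.
// The pattern does not match, so the string comes back unchanged.
$variant = "<h3 id='x' class='hdr'>Text</h3>";
$unchanged = preg_replace('#<h3 class="hdr">(.*?)</h3>#', '', $variant);
var_dump($unchanged === $variant);
```

The plain h3 inside div.main survives because it has no class attribute, which is what the question asks for.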


Something like this is what you're looking for...

$output = preg_replace("#<h3 class=\"hdr\">(.*?)</h3>#is", "", $input);

Use "is" at the end of the regex: "i" makes the match case-insensitive, and "s" lets "." match newlines, so the pattern still works when the tag spans several lines. Both make it more flexible.
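
To see what each modifier contributes, here is a small sketch. The input string is made up (an upper-case tag name and a heading that spans two lines), and the variable names are illustrative only.

```php
<?php
// Hypothetical input: upper-case tag name, heading text broken over two lines.
$input = "<div>\n<H3 class=\"hdr\">Line one\nLine two</H3>\n<p>kept</p>\n</div>";

$pattern = '<h3 class="hdr">(.*?)</h3>';

// No modifiers: fails on the upper-case tag and on the newline.
$none = preg_replace("#{$pattern}#", '', $input);

// "i" alone: tag case is accepted, but "." still stops at the newline.
$onlyI = preg_replace("#{$pattern}#i", '', $input);

// "is": case-insensitive and "." also matches newlines, so the element is removed.
$both = preg_replace("#{$pattern}#is", '', $input);

var_dump($none === $input);
var_dump($onlyI === $input);
var_dump($both);
```

Only the third call changes the string; the first two return the input untouched.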


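Try a preg_match, then a preg_replace, with the following pattern. It is written across several lines for readability, so it needs the x modifier (which makes the engine ignore whitespace in the pattern) in addition to i:

/(<h3
[\s]+
[^>]*?
class=[\"\'][^\"\']*?hdr[^\"\']*?[\"\']
[^>]*?>
[\s\S]*?
<\/h3>)/ix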

It's messy, and it will only work reliably if the h3 tag doesn't contain inline JavaScript with sequences that this regular expression would react to. It is far from perfect, but in simple cases where an h3 tag is used it should work.

Haven't tried it though, might need adjustments.
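
For what it's worth, here is a sketch of the same idea in runnable form. It is not the exact pattern above: it assumes the x and s modifiers, treats hdr as a whole class name rather than a substring, and requires the closing quote to match the opening one. The sample markup and variable names are made up for illustration.

```php
<?php
// Extended-mode (x) pattern: whitespace and "#" comments inside it are ignored.
$pattern = '~
    <h3 \b              # opening tag name
    [^>]*?              # any attributes before class
    \s class \s* = \s*  # the class attribute
    (["\'])             # opening quote, captured as group 1
    (?:[^"\']*\s)?      # other class names before hdr, if any
    hdr
    (?:\s[^"\']*)?      # other class names after hdr, if any
    \1                  # closing quote, same character as the opening one
    [^>]*>              # rest of the opening tag
    .*?                 # element contents (the s modifier lets "." span newlines)
    </h3>
~isx';

// Hypothetical input: one multi-line h3 carrying the hdr class among others,
// and one h3 whose class merely starts with "hdr".
$html = "<h3 id=\"a\" class='title hdr big'>\n  Remove me\n</h3>\n<h3 class=\"hdr2\">Keep me</h3>";

$result = preg_replace($pattern, '', $html);
var_dump($result);
```

The first heading is removed and the hdr2 heading is kept, which a plain substring test on hdr would not guarantee.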

Another way would be to copy the function that generates this HTML and use your copy without the h3, if that's possible.


This might help someone if the above solutions don't work. It removed an iframe and content with the '-webkit-overflow-scrolling: touch;' style, like I had :)

A regular expression (regex) describes the text you would like to remove, and the PHP function preg_replace() removes every match or replaces it with something else. In the examples below, $incoming_data holds your content before any elements are removed, and $result is the final product. Basically, we are telling the code to find all divs with class="myclass" and replace them with ' ' (a single space, so effectively nothing).

How to remove a div and its contents by class in PHP
Just change "myclass" to whatever class your div has.

$result = preg_replace('#<div class="myclass">(.*?)</div>#', ' ', $incoming_data);

How to remove a div and its contents by ID in PHP
Just change "myid" to whatever ID your div has.

$result = preg_replace('#<div id="myid">(.*?)</div>#', ' ', $incoming_data);

What if your div has multiple classes? Match on the ID and leave the rest of the opening tag open, changing "myid" to whatever ID your div has, like this:

$result = preg_replace('#<div id="myid(.*?)</div>#', ' ', $incoming_data);

Or, if the div doesn't have an ID, filter on the first class of the div like this:

$result = preg_replace('#<div class="myclass(.*?)</div>#', ' ', $incoming_data);
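
One caveat with these div patterns: the lazy (.*?) stops at the first closing </div> it finds, so a target div that contains another div is only partly removed. A minimal sketch, with made-up markup:

```php
<?php
// Hypothetical markup: the target div wraps an inner div plus some trailing text.
$incoming_data = '<div class="myclass"><div>inner</div> tail</div><p>after</p>';

// The match ends at the FIRST </div>, which closes the inner div, not the target.
$result = preg_replace('#<div class="myclass">(.*?)</div>#s', ' ', $incoming_data);

// " tail" and a stray closing </div> are left behind in the output.
var_dump($result);
```

If the divs you are removing can contain other divs, a DOM-based approach is likely the safer route.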

How to remove all headings in PHP
This is how to remove all h1 headings; repeat it with h2 through h6 for the other levels.

$result = preg_replace('#<h1>(.*?)</h1>#', ' ', $incoming_data);

And if the heading has a class, do something like this:

$result = preg_replace('#<h1 class="myclass">(.*?)</h1>#', ' ', $incoming_data);
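
If you want every heading level in one pass, a single pattern with a backreference is one option. This is a sketch rather than part of the original article; the sample input is made up, and it assumes headings are not nested inside each other.

```php
<?php
// Hypothetical input with three heading levels, one of them spanning two lines.
$incoming_data = "<h1>Title</h1><p>Body</p><h2 class=\"sub\">Sub\nheading</h2><h3>x</h3>";

// h([1-6]) captures the level; \1 requires the closing tag to use the same level.
// \b[^>]* allows attributes; "i" ignores tag case, "s" lets "." cross newlines.
$result = preg_replace('#<h([1-6])\b[^>]*>.*?</h\1>#is', ' ', $incoming_data);

var_dump(trim($result));
```

Only the paragraph remains once the surrounding spaces are trimmed.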

Source: http://www.lets-develop.com/html5-html-css-css3-php-wordpress-jquery-javascript-photoshop-illustrator-flash-tutorial/php-programming/remove-div-by-class-php-remove-div-contents/


Stumbled upon this via Google. For anyone else feeling dirty using regex to parse HTML, here's a DOMDocument solution I feel much safer going with:

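function removeTagByClass(string $html, string $className): string {
    // Parse the markup into a DOM tree (loadHTML may emit warnings on invalid HTML)
    $dom = new \DOMDocument();
    $dom->loadHTML($html);
    $finder = new \DOMXPath($dom);

    // Match any element whose space-separated class list contains $className as a whole word
    $nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' {$className} ')]");

    // Detach each matching element, together with its contents, from its parent
    foreach ($nodes as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize the whole document back to an HTML string
    return $dom->saveHTML();
}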

Thanks to this other answer for the XPath query.
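
One thing to be aware of: saveHTML() with no argument returns a full document, so the result comes back wrapped in a doctype plus html and body tags. Below is a sketch of a variant that returns only the fragment. The function name removeTagByClassFragment and the wrapper id fragment-root are my own assumptions, not part of the answer above, and it assumes the input is ASCII or otherwise safe for loadHTML's default encoding handling.

```php
<?php
// Hypothetical variant: same XPath query, but returns just the fragment.
function removeTagByClassFragment(string $html, string $className): string
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new \DOMDocument();
    // Wrap the input in a known container so it can be read back without html/body.
    $dom->loadHTML('<div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $finder = new \DOMXPath($dom);
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' {$className} ')]";
    foreach ($finder->query($query) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize only the children of the wrapper, not the whole document.
    $root = $finder->query('//*[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Markup from the question, collapsed onto one line.
$html = '<div class="container"><h3 class="hdr"> Text </h3>'
      . '<div class="main">text<h3> text... </h3></div></div>';
$cleaned = removeTagByClassFragment($html, 'hdr');
echo $cleaned, "\n";
```

The h3 with the hdr class is gone, the plain h3 inside div.main stays, and no html or body wrapper is added.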


$content = preg_replace('~(.*?)~', '', $content);

The above code only works if both halves of the div (the opening and closing tags) are on the same line. What if they aren't?

$content = preg_replace('~[^|]*?~', '', $content);

This works even if there is a line break in between, but fails if the rarely used | symbol appears in between. Does anyone know a better way?
