PHP Dom not retrieving element

Developer (开发者) https://www.devze.com 2022-12-23 05:43 Source: web
$code = '
<h1>Galeria </h1>

<div class="galeria">
    <ul id="galeria_list">
        <li>
          <img src="img.jpg" width="350" height="350" />
          <br />
          Teste
        </li>
    </ul>
</div>';


$dom = new DOMDocument;
$dom->validateOnParse = true;

$dom->loadHTML($code);

var_dump($dom->getElementById('galeria_list'));

The var_dump always returns NULL. Does anyone know why? I can clearly see the element with the id galeria_list in $code. Why isn't getElementById() finding it?

Also, does anyone know how to prevent DOMDocument from adding the <html> and <body> tags in the saveHTML() output?

Thanks


It seems that loadHTML() does not "attach" the HTML DTD that defines id as an ID attribute to the DOM. If the HTML document contains a DOCTYPE declaration, though, it works as intended. (But my guess is you don't want to add a DOCTYPE and an HTML skeleton anyway :)

$code = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head><title>...</title></head>
<body>
  <h1>Galeria </h1>
  <div class="galeria">
    <ul id="galeria_list">
      <li>
        <img src="img.jpg" width="350" height="350" />
        <br />
        Teste
      </li>
    </ul>
  </div>
</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($code);
var_dump($dom->getElementById('galeria_list'));
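
To see how your own setup behaves, the two cases can be compared side by side. This is only a sketch: idResolves() is a hypothetical helper name, the fragment is a shortened copy of the markup from the question, and the result for the DOCTYPE-less fragment may depend on the PHP/libxml build, so no output is claimed for it.

```php
<?php
// Hypothetical helper: load $html and report whether getElementById() finds $id.
function idResolves(string $html, string $id): bool
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier setting

    return $dom->getElementById($id) instanceof DOMElement;
}

// Shortened copy of the fragment from the question (no DOCTYPE, no <html>/<body>).
$fragment = '<div class="galeria"><ul id="galeria_list"><li>Teste</li></ul></div>';

// The same fragment inside a full document with a DOCTYPE, as in the answer above.
$withDoctype = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
    . '<html><head><title>...</title></head><body>' . $fragment . '</body></html>';

var_dump(idResolves($fragment, 'galeria_list'));    // may differ between builds
var_dump(idResolves($withDoctype, 'galeria_list'));
```

If the first call reports false on your build, either the DOCTYPE skeleton or an XPath lookup is needed.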


It seems that DOMDocument does not play nicely with HTML fragments. You may want to consider either DOMDocumentFragment (as dnagirl suggests) or extending DOMDocument.

After a little research, I've put together a simple extension that will achieve what you are asking:

class MyDOMDocument extends DOMDocument {

    // Look the element up with XPath instead of relying on a DTD that
    // declares "id" as an ID attribute. Assumes $id contains no single quote.
    // thanks to: http://www.php.net/manual/en/domdocument.getelementbyid.php#96500
    public function getElementById($id) {
        $xpath = new DOMXPath($this);
        return $xpath->query("//*[@id='$id']")->item(0);
    }

    // Return the markup without the DOCTYPE, <html> and <body> wrappers
    // that saveHTML() adds.
    // thanks to: http://www.php.net/manual/en/domdocument.savehtml.php#85165
    public function output() {
        $output = preg_replace('/^<!DOCTYPE.+?>/', '',
                str_replace( array('<html>', '</html>', '<body>', '</body>'),
                        array('', '', '', ''), $this->saveHTML()));

        return trim($output);
    }

}

Usage

$dom = new MyDOMDocument();
$dom->loadHTML($code);

var_dump($dom->getElementById("galeria_list"));

echo $dom->output();
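
As an alternative to stripping the wrapper tags with string replacement, saveHTML() accepts an optional node argument (PHP 5.3.6 and later), so the children of the generated <body> can be serialised one by one. A minimal sketch, assuming the markup was loaded with loadHTML(); the markup is a shortened copy of the question's snippet and $fragmentHtml is just an illustrative name:

```php
<?php
$code = '<h1>Galeria</h1><div class="galeria"><ul id="galeria_list"><li>Teste</li></ul></div>';

$dom = new DOMDocument();
$dom->loadHTML($code);

// loadHTML() wraps the fragment in <html><body>; locate that <body> element.
$body = $dom->getElementsByTagName('body')->item(0);

// Serialise each child of <body> on its own, so the wrappers never appear.
$fragmentHtml = '';
foreach ($body->childNodes as $child) {
    $fragmentHtml .= $dom->saveHTML($child);
}

echo $fragmentHtml, "\n";
```

This avoids the case where str_replace() misses a wrapper tag that carries attributes, at the cost of requiring a PHP version that supports the node argument.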


You might consider DOMDocumentFragment rather than DOMDocument if you don't want headers.
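
A rough sketch of the fragment approach follows. Note that DOMDocumentFragment::appendXML() parses XML rather than HTML, so it only accepts well-formed markup (the snippet from the question qualifies because its <img> and <br> tags are self-closed). The wrapper <div> and the variable names are illustrative assumptions, not part of the original code, and the output is serialised as XML:

```php
<?php
$code = '<h1>Galeria</h1>
<div class="galeria">
    <ul id="galeria_list">
        <li><img src="img.jpg" width="350" height="350" /><br />Teste</li>
    </ul>
</div>';

$dom = new DOMDocument();

// appendXML() expects well-formed XML and returns false if parsing fails.
$fragment = $dom->createDocumentFragment();
$parsed = $fragment->appendXML($code);

// Hypothetical wrapper element, so the document ends up with a single root.
$wrapper = $dom->createElement('div');
$wrapper->appendChild($fragment);
$dom->appendChild($wrapper);

// No DTD declares "id" as an ID attribute here, so look the list up with XPath.
$xpath = new DOMXPath($dom);
$list = $xpath->query("//*[@id='galeria_list']")->item(0);

// Serialise only the wrapper's children: no DOCTYPE, <html> or <body> is added.
$html = '';
foreach ($wrapper->childNodes as $child) {
    $html .= $dom->saveXML($child);
}

echo $list->tagName, "\n";
echo $html, "\n";
```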

As for the id problem, this is from the manual:

<?php

$doc = new DOMDocument;

// We need to validate our document before referring to the id
$doc->validateOnParse = true;
$doc->load('book.xml');

echo "The element whose id is books is: " . $doc->getElementById('books')->tagName . "\n";

?> 

validateOnParse is likely the issue.


Someone worked around this problem in the PHP manual by using XPath: http://us3.php.net/manual/en/domdocument.getelementbyid.php#96500
