Using simplehtmldom to grab a text snippet

Developer https://www.devze.com 2023-02-09 23:54 Source: Internet

Related topics: dom php

I'm trying to use the simplehtmldom script to get at some text. The HTML structure is as follows:

<div id="posts">
  <div align="center">
    <SEVERAL LEVELS OF HTML>
      <strong>XXX</strong>
    </SEVERAL LEVELS OF HTML>
  </div>
  <div align="center">
    <SEVERAL LEVELS OF HTML>
      <strong>IGNORE</strong>
    </SEVERAL LEVELS OF HTML>
  </div>
  <div align="center">
    <SEVERAL LEVELS OF HTML>
      <strong>IGNORE</strong>
    </SEVERAL LEVELS OF HTML>
  </div>
</div>

The text I'm trying to get at is the XXX string, in the first <strong> tags inside the first <div> with attribute align="center", which is inside the <div> with id="posts". I'm not interested in the text in <div align="center"> tags further down.

The "several levels of HTML" include messy nested tables etc.

My code is below. I'm using descendant selectors, so I'm obviously "skipping" over the several levels of HTML. Is this the reason why my print_r call produces "Trying to get property of non-object"?

$html = file_get_html($page_1);
$es = $html->find('div#posts div[align=center] strong');
print_r($es->plaintext); die;

Strangely enough this statement also returns the same "Trying to get property of non-object" result. What am I doing wrong?

$es = $html->find('div#posts');

Two possible reasons:

1. In $html = file_get_html($page_1);, $page_1 may not be a URL. If it is a string containing HTML, use str_get_html instead, as in $html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');.
2. The HTML contains more than one div#posts (which it shouldn't).
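
A minimal sketch of the first point, assuming simple_html_dom.php sits next to the script. The $markup string is a hypothetical stand-in for the real page, with a nested table playing the role of the "several levels of HTML". It also checks one more thing that may be worth ruling out: find() called without an index returns an array of matches, so reading ->plaintext directly on that result would raise the same non-object notice.

```php
<?php
// Assumption: the PHP Simple HTML DOM Parser file is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical sample markup standing in for the real page.
$markup = '<div id="posts">'
    . '<div align="center"><table><tr><td><strong>XXX</strong></td></tr></table></div>'
    . '<div align="center"><table><tr><td><strong>IGNORE</strong></td></tr></table></div>'
    . '</div>';

// str_get_html() parses a string of HTML; file_get_html() expects a URL or file path.
$html = str_get_html($markup);
if (!$html) {
    die('Could not parse the markup');
}

// Without an index, find() returns an array of every matching element.
$all = $html->find('div#posts div[align=center] strong');
echo count($all), "\n";

// With index 0, find() returns only the first match, or null if nothing matched.
$first = $html->find('div#posts div[align=center] strong', 0);
if ($first !== null) {
    echo $first->plaintext, "\n";
}
```

Checking the return value before calling find(), and checking for null before reading plaintext, makes it easier to tell a failed load apart from a selector that simply matched nothing.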