
PHP screen scraping using PHP Simple HTML DOM Parser

开发者 https://www.devze.com 2023-02-04 06:40 Source: Internet
I am using Simple HTML DOM Parser to scrape a website ... How can I skip a particular class while in a loop?


Judging from http://simplehtmldom.sourceforge.net/manual.htm#frag_find_attr you can use an attribute filter in the selector:

->find("div[class!=skip_me]")

Or use the DOM methods and check with ->getAttribute("class") against a value.
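
As a rough illustration of the DOM route, here is a minimal sketch using PHP's built-in DOMDocument rather than Simple HTML DOM. The sample markup and the variable names ($html, $kept) are made up for the example; only the class name skip_me comes from the selector above.

```php
<?php
// Hypothetical sample markup; only the class name "skip_me" comes from the answer.
$html = '<html><body>'
      . '<div class="keep">first</div>'
      . '<div class="skip_me">ignored</div>'
      . '<div class="keep other">second</div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often malformed.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$kept = [];
foreach ($dom->getElementsByTagName('div') as $div) {
    // getAttribute() returns an empty string when the attribute is missing.
    // Split on whitespace so elements with several classes are handled too.
    $classes = preg_split('/\s+/', $div->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (in_array('skip_me', $classes, true)) {
        continue; // skip this element and carry on with the loop
    }
    $kept[] = $div->textContent;
}

print_r($kept);
```

Splitting the attribute on whitespace is a design choice of this sketch: a plain string comparison against "skip_me" would miss elements that carry additional classes.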


  // DOMDocument can load HTML tag soup, but malformed markup triggers
  // warnings, so suppress them with @.
  $htmlDom = new DOMDocument();
  if (@$htmlDom->loadHTML($html)) {
    // SimpleXML is much easier to work with than DOM, and luckily
    // the DOM tree can simply be imported.
    $elements = simplexml_import_dom($htmlDom);
  }

This is (almost) a quote from Drupal 7 SimpleTest. After this, the document is a lot easier to work with: for each element you loop over, the class can be read as $element['class'].
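
To show how the class check might look once the tree is in SimpleXML, here is a small self-contained sketch. The sample markup, the //div XPath query and the variable names are assumptions made for the example and are not part of the Drupal code.

```php
<?php
// Hypothetical sample markup for illustration only.
$html = '<html><body>'
      . '<div class="keep">first</div>'
      . '<div class="skip_me">ignored</div>'
      . '<div>no class</div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Import the DOM tree; the result is the root <html> element.
$root = simplexml_import_dom($dom);

$texts = [];
foreach ($root->xpath('//div') as $element) {
    // Attributes are read with array syntax; cast to string before comparing.
    // A missing attribute casts to an empty string.
    if ((string) $element['class'] === 'skip_me') {
        continue; // skip this element
    }
    $texts[] = (string) $element;
}

print_r($texts);
```

This sketch compares the whole attribute value, so an element with class="skip_me other" would not be skipped; split the value on whitespace first if that case matters.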
