Simple HTML Dom

Developer https://www.devze.com 2023-02-16 22:28 Source: Internet
Thanks for taking the time to read my post... I'm trying to extract some information from my website using Simple HTML Dom...

I have it reading from the HTML source ok, now I'm just trying to extract the information that I need. I have a feeling I'm going about this in the wrong way... Here's my script...

<?php

include_once('simple_html_dom.php');

// create doctype
$dom = new DOMDocument("1.0");

// display document in browser as plain text
// for readability purposes
//header("Content-Type: text/plain");

// create root element
$xmlProducts = $dom->createElement("products");
$dom->appendChild($xmlProducts);

$html = file_get_html('http://myshop.com/small_houses.html');
$html .= file_get_html('http://myshop.com/medium_houses.html');
$html .= file_get_html('http://myshop.com/large_houses.html');

    //Define my variable for later
    $product['image'] = '';
    $product['title'] = '';
    $product['description'] = '';

foreach($html->find('img') as $src){

    if (strpos($src->src,"http://myshop.com") === false) {
        $src->src = "http://myshop.com/$src->src";
    }
       $product['image'] = $src->src;
}

foreach($html->find('p[class*=imAlign_left]') as $description){
       $product['description'] =  $description->innertext;
}

foreach($html->find('span[class*=fc3]') as $title){
       $product['title'] =  $title->innertext;
}

echo $product['img'];
echo $product['description'];
echo $product['title'];

?>

I put echo statements at the end for the sake of testing, but I'm not getting anything... Any pointers would be a great help!

Thanks

Charles


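file_get_html() returns a Simple HTML DOM object, and you cannot concatenate objects. The object does have a __toString method, but when you concatenate several of them with .= you end up with a plain string (and more than likely a corrupt document), not an object you can call find() on. Also note that you echo $product['img'], but the key you set is $product['image']. Try the following: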

<?php

include_once('simple_html_dom.php');

// create doctype
$dom = new DOMDocument("1.0");

// display document in browser as plain text
// for readability purposes
//header("Content-Type: text/plain");

// create root element
$xmlProducts = $dom->createElement("products");
$dom->appendChild($xmlProducts);

$pages = array(
    'http://myshop.com/small_houses.html',
    'http://myshop.com/medium_houses.html',
    'http://myshop.com/large_houses.html'
);


foreach($pages as $page)
{
    $product = array();
    $source = file_get_html($page);

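    foreach($source->find('img') as $src)
    {
        // prefix relative image paths with the shop's base URL
        if (strpos($src->src,"http://myshop.com") === false)
        {
            $src->src = "http://myshop.com/$src->src";
        }
        // store the image URL whether or not it needed the prefix
        $product['image'] = $src->src;
    }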

    foreach($source->find('p[class*=imAlign_left]') as $description)
    {
        $product['description'] =  $description->innertext;
    }

    foreach($source->find('span[class*=fc3]') as $title)
    {
        $product['title'] =  $title->innertext;
    }

    // for debugging purposes

    echo "Current Page: " . $page . "\n";
    print_r($product);
    echo "\n\n\n"; // clear separator between pages
}
?>
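
If you want to see why the .= approach breaks, here's a minimal sketch. FakeDom is a made-up stand-in (it is not the real Simple HTML DOM class); the only assumption is that, like the library's object, it has a __toString method. After the concatenation PHP has turned everything into a string, so a later ->find() call has no object to work on.

```php
<?php

// Hypothetical stand-in for a parsed page; NOT the real Simple HTML DOM class.
class FakeDom
{
    private $markup;

    public function __construct($markup)
    {
        $this->markup = $markup;
    }

    // Called by PHP whenever the object is used in a string context
    public function __toString()
    {
        return $this->markup;
    }
}

$html = new FakeDom('<p>small</p>');
var_dump(is_object($html)); // still an object at this point

// .= converts both sides to strings and joins them
$html .= new FakeDom('<p>medium</p>');

var_dump(is_object($html)); // no longer an object
var_dump(is_string($html)); // it is now a plain string

?>
```

That's why the loop above keeps one object per page instead of gluing the pages together.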