Parsing the source of a page to retrieve table data and export it to xls

Developer https://www.devze.com 2023-03-27 03:17 Source: Internet

I need to paste the source of a page into a form and have it produce an xls file containing the contents of the page's tables.

The page I wish to parse has several tables on it, each with a varying number of rows and 11 columns. Each table has a header, which I don't need. I have researched using DOM, but I couldn't figure out a way to use that object for my application. I thought about using preg_replace() as well, but since I am dealing with HTML source code, I don't think that will work.

Once I get the parsing portion right, I know how to write the data to an xls file in PHP. I just cannot figure out how to do the parsing itself in PHP. Thanks in advance.

If it helps, this is what the table structure looks like for each table.

<table>
  <thead>
    <tr>
      <td>
      </td>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
      </td>
    </tr>
  </tbody>
</table>


This should get you started, at least:

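// $htmlString holds the page source submitted through the form.
$doc = new DOMDocument();
// Real-world HTML is rarely valid; keep parser warnings out of the output.
libxml_use_internal_errors(true);
$doc->loadHTML($htmlString);

// Get all table bodies; the <thead> header rows are skipped this way.
$tables = $doc->getElementsByTagName('tbody');

$data = [];
foreach ($tables as $table) {
    $rows = $table->getElementsByTagName('tr');
    foreach ($rows as $row) {
        $cells = $row->getElementsByTagName('td');
        $rowData = [];
        foreach ($cells as $cell) {
            $rowData[] = trim($cell->nodeValue);
        }
        $data[] = $rowData; // one array of cell values per table row
    }
}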
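
The same traversal can also be written with DOMXPath, which keeps the "body rows only" rule in a single query. The following is only a sketch: the function name extractTableRows and the sample markup are made up for illustration, and it assumes every table has an explicit <tbody>, as in the structure shown in the question (PHP's HTML parser does not insert one automatically the way browsers do).

```php
<?php
// Hypothetical helper (not part of the original answer): returns one array
// per body row, each holding the trimmed text of that row's <td> cells.
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings silently, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    // Only rows inside <tbody> match, so <thead> header rows are skipped.
    foreach ($xpath->query('//table/tbody/tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Made-up sample input that mirrors the table structure from the question.
$html = '<table><thead><tr><td>Name</td><td>Qty</td></tr></thead>'
      . '<tbody><tr><td>Apple</td><td> 3 </td></tr>'
      . '<tr><td>Pear</td><td>5</td></tr></tbody></table>';
$rows = extractTableRows($html);
```

Each element of $rows is then a plain array of cell strings (11 per row for the pages described in the question), which can be handed to whatever xls-writing code is already in place.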