Parsing whole table in Perl using TableExtract

Developer https://www.devze.com 2023-02-14 13:36 Source: Internet
I want to parse a whole table using HTML::TableExtract in Perl. This is what I wrote:

use HTML::TableExtract;
use LWP::Simple;
use Data::Dumper;

my $te  = new HTML::TableExtract( depth=>3, count=>3, gridmap=>0);
my $content = get("C:/Users/admin/Desktop/tabela.html");
$te->parse($content);

foreach $ts ($te->table_states)
{
    print $ts;
    foreach $row ($ts->rows)
    {
         print Dumper $row;
         #print Dumper $row if (scalar(@$row) == 2);
    }
}

And this is what the file "tabela.html" looks like:

<table width=100% align=center cellspacing=0 cellpadding=0 class='raspored_1x2'><tr class=svetlija><td align=center width=20% >02.03.2011 20:30</td><td align=center width=5% >261</td><td align=right width=21% >AUSTRIA W.</td><td align=center width=2% >-</td><td align=left width=21% >STURM</td><td align=right width=8%>
            <a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=1&amp;kvota=1.80>1.80</a></td><td align=right width=8% >
            <a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=X&amp;kvota=3.30>3.30</a></td><td align=right width=8% >
            <a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=2&amp;kvota=3.90>3.90</a></td><td width=7%>
            <a title='Pogledaj kvote' href='javascript:void(0)' onclick="prikaziKvote('261-5018-2011')">
            <img src="http://www.balkanbet.co.rs/site/templates/balkanbet_green/images/arrow_down.gif" class='strelica'>
            </a>
            </td></tr></table>

When I run the Perl script, nothing happens. Does anyone have an idea what the problem is?


#!/usr/bin/env perl
use warnings;
use strict;
use HTML::TableExtract;
use Data::Dumper;

my $content =<<EOC; 
<table width=100% align=center cellspacing=0 cellpadding=0 class='raspored_1x2'>
<tr class=svetlija>
<td align=center width=20% >02.03.2011 20:30</td>
<td align=center width=5% >261</td>
<td align=right width=21% >AUSTRIA W.</td>
<td align=center width=2% >-</td>
<td align=left width=21% >STURM</td>
<td align=right width=8%><a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=1&amp;kvota=1.80>1.80</a></td>
<td align=right width=8% ><a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=X&amp;kvota=3.30>3.30</a></td>
<td align=right width=8% ><a title="dodaj u tiket" href=?option=com_content&amp;task=view&amp;id=24&amp;Itemid=31&amp;sport=Fudbal&amp;a=add&amp;rb=261-5018-2011&amp;dom=AUSTRIA+W.&amp;gost=STURM&amp;tip=2&amp;kvota=3.90>3.90</a></td>
<td width=7%><a title='Pogledaj kvote' href='javascript:void(0)' onclick="prikaziKvote('261-5018-2011')"><img src="http://www.balkanbet.co.rs/site/templates/balkanbet_green/images/arrow_down.gif" class='strelica'></a></td>
</tr>
</table>
EOC

my $te = HTML::TableExtract->new();

$te->parse( $content );

for my $ts ($te->table_states) { 
    print $ts; 
    for my $row ($ts->rows) { 
        print Dumper $row; 
        # print Dumper $row if (scalar(@$row) == 2); 
    } 
}

# HTML::TableExtract::Table=HASH(0x91e2e0)$VAR1 = [
#          '02.03.2011 20:30',
#          '261',
#          'AUSTRIA W.',
#          '-',
#          'STURM',
#          '1.80',
#          '3.30',
#          '3.90',
#          undef
#        ];
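
The script above differs from the question's in one more way: it constructs the extractor without the depth and count constraints. As far as I know, HTML::TableExtract counts both from zero, so a single top-level table sits at depth 0, count 0 and would not match depth=>3, count=>3. Here is a minimal sketch for checking that; the one-row $html string is a made-up stand-in for tabela.html, not the real page.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical minimal page: one top-level table, standing in for tabela.html.
my $html = '<table><tr><td>AUSTRIA W.</td><td>STURM</td></tr></table>';

# No constraints: every table is matched, so its coordinates can be inspected.
my $te = HTML::TableExtract->new();
$te->parse($html);
$te->eof;

for my $ts ($te->tables) {
    my ($depth, $count) = $ts->coords;
    print "table found at depth=$depth, count=$count\n";
}

# The same page parsed with the constraints used in the question.
my $strict = HTML::TableExtract->new( depth => 3, count => 3 );
$strict->parse($html);
$strict->eof;

my @matched = $strict->tables;
print "tables matched with depth=>3, count=>3: ", scalar(@matched), "\n";
```

If the unconstrained run reports coordinates other than the ones you passed to the constructor, the constraints are the likely reason no rows are printed.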


There may be other problems, but the first thing that springs to mind is that you are using LWP::Simple and passing get() a file path, not a URL.

You probably want File::Slurp instead of LWP::Simple (note that it has a different API, so you'll need to replace get()).
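
As an alternative to File::Slurp, a local file can also be read with core Perl alone. The sketch below is one possible way to do it. read_html_file is a helper name invented for this example, and the path in the usage comment is simply the one from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Read a whole local file into one string using only core Perl.
# read_html_file is a hypothetical helper name, not part of any module.
sub read_html_file {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open '$path': $!";
    local $/;    # undefined record separator: read the file in one go
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Usage: perl script.pl C:/Users/admin/Desktop/tabela.html
if (@ARGV) {
    my $content = read_html_file( $ARGV[0] );
    printf "read %d characters\n", length $content;
    # $content can now be handed to $te->parse($content).
}
```

Unlike get() from LWP::Simple, which returns undef on failure, this version dies with the operating system's error message when the file cannot be opened, so a wrong path is noticed immediately.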
