
XPath not finding elements past the first

开发者 https://www.devze.com 2023-02-15 08:07 Source: the web

I'm working on a scraper using XPath, but XPath seems inexplicably incapable of retrieving the information that I need. I've been able to get the code below to print out the table element and all of its contents, but as soon as I try to select the tbody or tr elements, it starts returning None. The URL is included below as well.

I've used XPather in Firefox to confirm that the XPath below is correct, but for some reason the same path fails once I use it from Python.

url = 'http://www.arkleg.state.ar.us/assembly/2011/2011R/pages/CommitteeDetail.aspx?committeecode=000'

with self.urlopen(url) as page:
    page = lxml.html.fromstring(page)

    for tr in page.xpath('//table[@class="gridtable"]/tbody/tr'):
        print(tr.xpath('string(td[1])'))


Firefox adds an implicit tbody element inside the table when it builds the DOM, but that element doesn't exist in the source HTML for that page, which is what lxml parses. This XPath should find all the tr elements:

for tr in page.xpath('.//table[@class="gridtable"]/tr'):
    print(tr.xpath('string(td[1])'))
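To see the difference without fetching the live page, here is a minimal sketch. The SAMPLE_HTML string and the first_cells helper are hypothetical stand-ins (they are not from the original page or question), and the sketch assumes lxml is installed. It shows that the /tbody/tr path matches nothing when the source has no tbody, and uses an XPath union so the query works whether or not a tbody is present:

```python
import lxml.html

# Hypothetical stand-in for the real page: the source has no <tbody>,
# just like the raw HTML that lxml receives.
SAMPLE_HTML = """
<html><body>
<table class="gridtable">
  <tr><td>Alpha</td><td>1</td></tr>
  <tr><td>Beta</td><td>2</td></tr>
</table>
</body></html>
"""


def first_cells(html):
    """Return the text of the first cell of every row in the gridtable."""
    doc = lxml.html.fromstring(html)
    # Union of both shapes: rows directly under the table, and rows
    # wrapped in a tbody. Results come back in document order.
    rows = doc.xpath(
        '//table[@class="gridtable"]/tr'
        ' | //table[@class="gridtable"]/tbody/tr'
    )
    return [row.xpath('string(td[1])').strip() for row in rows]


doc = lxml.html.fromstring(SAMPLE_HTML)
# The browser-derived path finds nothing, because lxml does not invent a tbody.
print(doc.xpath('//table[@class="gridtable"]/tbody/tr'))
print(first_cells(SAMPLE_HTML))
```

The union costs a slightly longer expression, but it keeps working if the site later starts emitting an explicit tbody in its markup.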
