
Stumped on clicking a link with Nokogiri and Mechanize

Developer https://www.devze.com 2023-04-06 11:47 Source: Internet

Perhaps I'm doing it wrong, or there's a more efficient way. Here is my problem:

First, I open an HTML document with Nokogiri and use its CSS selectors to traverse the document until I find the link I need to click.

Once I have the link, how do I use Mechanize to click it? According to the documentation, the click method of the object returned by Mechanize.new takes either a string or a Mechanize::Page::Link object.

I cannot use a string, since there could be hundreds of identical links. I only want Mechanize to click the link that Nokogiri found.

Any ideas?


After you have found the link node you need, you can create the Mechanize::Page::Link object manually, and click it afterwards:

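require 'mechanize'

agent = Mechanize.new
page = agent.get "http://google.com"
# Use `at` to get a single node; `search` returns a NodeSet.
# The selector should match the <a> element itself, since Link reads its href.
node = page.at ".//p[@class='posted']"
Mechanize::Page::Link.new(node, agent, page).click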


An easier way than @binarycode's option:

agent = Mechanize.new
page = agent.get "http://google.com"
page.link_with(:class => 'posted').click


This is simple; you don't need Mechanize's link_with(...).click.

You can just get the link's href and update your page variable.

Mechanize keeps track of the current page internally, so it can resolve relative (local) links against it.

Example:

agent = Mechanize.new
page = agent.get "http://somesite.com"

next_page_link = page.at('your exotic selectors here')  # Nokogiri element, or nil if nothing matches
next_page_href = next_page_link && next_page_link['href']  # e.g. '/local/link/file.html'

page = agent.get(next_page_href) if next_page_href  # goes to 'http://somesite.com/local/link/file.html'
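
The relative-link resolution mentioned above can be illustrated with Ruby's standard URI library alone. This is only an approximation of the idea, not Mechanize's actual code; the helper name `resolve_href` and the sample URLs are hypothetical.

```ruby
require 'uri'

# Hypothetical helper: resolve an href against the URL of the page it came from.
def resolve_href(current_page_url, href)
  URI.join(current_page_url, href).to_s
end

current = 'http://somesite.com/section/index.html'

# Root-relative, document-relative, and absolute hrefs.
['/local/link/file.html', 'next.html', 'http://other.example/page'].each do |href|
  puts "#{href} -> #{resolve_href(current, href)}"
end
```

A root-relative href replaces the whole path, a document-relative one replaces only the last path segment, and an absolute URL is returned unchanged.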