Parsing some results returned by nokogiri in ruby, getting an error message

Developers https://www.devze.com 2022-12-25 06:41 Source: Internet
The following code returns an error:

require 'nokogiri'
require 'open-uri'

@doc = Nokogiri::HTML(open("http://www.amt.qc.ca/train/deux-montagnes/deux-montagnes.aspx"))
#@doc = Nokogiri::HTML(File.open("deux-montagnes.html"))
stations =  @doc.xpath("//area")
stations.each { |station| str = station
    reg = /href="(.*)" title="(.*)"/
    href = reg.match(str)[1]
    title = reg.match(str)[2]
    page = /.*\/(.*).aspx$/.match(href)[1]
    puts href
    puts title
    puts page
    base_url = "http://www.amt.qc.ca"
    complete_url = base_url + href
    puts complete_url
}

ERROR:

station_names_from_map.rb:9:in `block in <main>': undefined method `[]' for nil:NilClass (NoMethodError)
        from /opt/local/lib/ruby1.9/gems/1.9.1/gems/nokogiri-1.4.1/lib/nokogiri/xml/node_set.rb:213:in `block in each'
        from /opt/local/lib/ruby1.9/gems/1.9.1/gems/nokogiri-1.4.1/lib/nokogiri/xml/node_set.rb:212:in `upto'
        from /opt/local/lib/ruby1.9/gems/1.9.1/gems/nokogiri-1.4.1/lib/nokogiri/xml/node_set.rb:212:in `each'
        from station_names_from_map.rb:7:in `<main>'

shell returned 1

While this code works:

str = '<area shape="poly" alt="Deux-Montagnes" coords="59,108,61,106,65,106,67,108,67,113,65,115,61,115,59,113" href="/train/deux-montagnes/deux-montagnes.aspx" title="Deux-Montagnes">'

reg = /href="(.*)" title="(.*)"/
href = reg.match(str)[1]
title = reg.match(str)[2]
page = /.*\/(.*).aspx$/.match(href)[1]
puts href
puts title
puts page
base_url = "http://www.amt.qc.ca"
complete_url = base_url + href
puts complete_url

Any reason why?


As Chuck pointed out in his comment, the problem is the match:

href = reg.match(str)[1]

At some iteration of the loop, reg.match(str) returns nil, so the expression is evaluated as:

nil[1]

which raises the NoMethodError. Guard each of your matches with something like:

href_match = reg.match(str)
href = '' # drop this line if you don't need a default value and nil is acceptable
href = href_match[1] unless href_match.nil?
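
The same guard can be wrapped in a small method so that both captures come from a single match. This is only a sketch: `extract_link`, `AREA_PATTERN` and the second sample string are hypothetical names and data made up for illustration, not part of the original script.

```ruby
# Same regex as in the question, stored once so it is not rebuilt per iteration.
AREA_PATTERN = /href="(.*)" title="(.*)"/

# Hypothetical helper: returns [href, title], or nil when the pattern does not match.
def extract_link(str)
  match = AREA_PATTERN.match(str)
  return nil if match.nil?   # no href/title pair in this string
  [match[1], match[2]]
end

samples = [
  '<area shape="poly" href="/train/deux-montagnes/deux-montagnes.aspx" title="Deux-Montagnes">',
  '<area shape="rect" alt="background">'   # made-up input without href/title
]

samples.each do |str|
  link = extract_link(str)
  if link
    href, title = link
    puts "#{title}: #{href}"
  else
    puts "skipped: #{str}"
  end
end
```

Matching once and checking the result before indexing avoids calling `[]` on nil, whichever input caused the failed match.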


|station| str = station.to_s

This solves the problem because station is actually a Nokogiri::XML::Element instance, not a String. Calling to_s gives the element's markup, which is what the regex expects.
