Parsing XML with Nokogiri

Posted on https://www.devze.com, 2023-03-03 10:29. Source: the web.

I cannot figure out how to parse the "author" and "fact" tags out of the following XML. If the formatting looks strange, here is a link to the XML doc.

<response stat="ok">
  <ltml version="1.1">
    <item id="5403381" type="work">
      <author id="21" authorcode="rowlingjk">J. K. Rowling</author>
      <url>http://www.librarything.com/work/5403381</url>
      <commonknowledge>
        <fieldList>
          <field type="42" name="alternativetitles" displayName="Alternate titles">
            <versionList>
              <version id="3413291" archived="0" lang="eng">
                <date timestamp="1298398701">Tue, 22 Feb 2011 13:18:21 -0500</date>
                <person id="18138">
                  <name>ablachly</name>
                  <url>http://www.librarything.com/profile/ablachly</url>
                </person>
                <factList>
                  <fact>Harry Potter and the Sorcerer's Stone </fact>
                </factList>
              </version>
            </versionList>
          </field>

So far I have tried this code to get the author but it does not work:

@xml_doc = Nokogiri::XML(open("http://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&isbn=0590353403&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4"))

@xml_doc.xpath('//response').each do |n|
    @author = n      
end


I couldn't get at any nodes deeper than //response using the link you provided. I ended up using Nokogiri::XML::Reader and pushing elements into a hash, since there may be multiple authors, and there are definitely multiple facts. You can use whatever data structure you like, but this gets the content of the fact and author tags:

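require 'nokogiri'
require 'open-uri'

url = "http://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&isbn=0590353403&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4"
# URI.open is provided by open-uri; on Ruby 3.0+ a bare open(url) no longer fetches URLs.
reader = Nokogiri::XML::Reader(URI.open(url))

# Collected element contents, keyed by tag name.
book = {
  author: [],
  fact: []
}

# Stream through the document and keep the non-empty content of matching tags.
reader.each do |node|
  book.each_key do |k|
    if node.name == k.to_s && !node.inner_xml.empty?
      book[k] << node.inner_xml
    end
  end
end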


You could try:

nodes = @xml_doc.xpath("//xmlns:author", "xmlns" => "http://www.librarything.com/")
puts nodes[0].inner_text

nodes = @xml_doc.xpath("//xmlns:fact", "xmlns" => "http://www.librarything.com/")
nodes.each do |n|
   puts n.inner_text
end

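The trick is in the namespace: the elements belong to the http://www.librarything.com/ namespace, so the XPath has to bind a prefix to that URI. An unprefixed //author will not match them.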
