Let's say I have this in a document:
<entry>
<link rel="replies" type="application/atom+xml" href="http://www.url.com/feeds/1/comments/default" title="Comments"/>
<link rel="alternate" type="text/html" href="http://www.url.com/a_blog_post.html" title="A Blog Post"/>
</entry>
<entry>
<link rel="replies" type="application/atom+xml" href="http://www.url.com/feeds/2/comments/default" title="Comments"/>
<link rel="alternate" type="text/html" href="http://www.url.com/another_blog_post.html" title="Another Blog Post"/>
</entry>
I am trying to use Nokogiri to pull the URLs for each of the blog posts, but I am apparently going about it all wrong (I'm new to programming and having trouble understanding Nokogiri).
Here's what I have:
require 'nokogiri'
require 'open-uri'

def get_posts(url)
  posts = []
  doc = Nokogiri::HTML(open(url))
  doc.css('entry.alternate').each do |e|
    puts e['href']
    posts << e['href']
  end
  return posts
end

puts "Enter feed url:"
url = gets.chomp
posts = get_posts(url)
puts posts.to_s
Any help would be great! I started this little thing to better learn to program, but I'm stuck. My output currently is []
Your CSS selector is wrong: entry.alternate would select all entry elements that have the class alternate (something like <entry class="alternate" />). I suppose you want to select all link elements whose rel attribute has the value alternate. The CSS selector for that is link[rel=alternate], so change your code like this:
  doc.css('link[rel=alternate]').each do |e|
    puts e['href']
    posts << e['href']
  end
You can read more about CSS selectors here: http://www.w3.org/TR/CSS2/selector.html.
Try doc.xpath("//entry/link[@rel='alternate']") instead of doc.css('entry.alternate'). It works for me.
If you only want the href attribute of the links, note that you can more simply do:
def get_posts(url)
  # URI.open comes from open-uri; on Ruby 3.0+ a bare open(url) no longer fetches URLs
  Nokogiri::XML(URI.open(url))
    .xpath('//link[@rel="alternate"]/@href')
    .map(&:value)
end
The XPath above selects not the link elements, but the href attributes on those elements; the map then turns this set of Nokogiri::XML::Attr objects into an array of just their values (as strings). Since this is the last expression in the method, the array is the return value.