I'd like to get the actual URL strings from the hyperlinks. I'd like my result to be stripped of HTML.
So, if one of my input strings is
<a href="http://target.com/resource.tar.gz">resource</a>
I'd like to get:
http://target.com/resource.tar.gz
How can I do this?
In Hpricot you access attributes of an element using square brackets (like you would when accessing elements in a Hash). So, to use your example:
require 'hpricot'
doc = Hpricot('<a href="http://target.com/resource.tar.gz">resource</a>')
puts doc.at('a')['href'] # => http://target.com/resource.tar.gz