Getting portion of href attribute using hpricot

Developer https://www.devze.com 2023-01-23 07:49 Source: Internet

I think I need a combo of hpricot and regex here. I need to search for 'a' tags with an 'href' attribute that starts with '/abc/', and return the text following that, up to the next forward slash '/'.

So, given:

<a href="/abc/12345/xyz123/">One</a>
<a href="/abc/67890/xyzabc/">Two</a>

I need to get back: '12345' and '67890'

Can anyone lend a hand? I've been struggling with this.

You don't need a regex, but you can use one. Here are two examples, one with a regex and one without, using Nokogiri, which should be compatible with Hpricot for your use, and which uses CSS accessors:

require 'nokogiri'

html = %q[
  <a href="/abc/12345/xyz123/">One</a>
  <a href="/abc/67890/xyzabc/">Two</a>
]

doc = Nokogiri::HTML(html)
doc.css('a[@href]').map{ |h| h['href'][/(\d+)/, 1] } # => ["12345", "67890"]
doc.css('a[@href]').map{ |h| h['href'].split('/')[2] } # => ["12345", "67890"]

Or use a regex:

s = '<a href="/abc/12345/xyz123/">One</a>'
s =~ /abc\/([^\/]*)/  # capture everything after "abc/" up to the next "/"
$1 # => "12345"
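
The regex above handles one string at a time. As a rough sketch of how the same idea could cover several links at once, String#scan can collect every match from a raw HTML string. The helper name ids_after_abc is made up for illustration, and the pattern assumes double-quoted href values; it is not a substitute for a real HTML parser.

```ruby
# Hypothetical helper: collect the path segment that follows "/abc/" in every
# double-quoted href found in a raw HTML string. Regex only, no HTML parser.
def ids_after_abc(html)
  # With one capture group, scan returns an array of one-element arrays,
  # so flatten turns it into a plain list of strings.
  html.scan(%r{href="/abc/([^/"]+)}).flatten
end

html = <<~HTML
  <a href="/abc/12345/xyz123/">One</a>
  <a href="/abc/67890/xyzabc/">Two</a>
  <a href="/other/999/">Three</a>
HTML

p ids_after_abc(html) # prints ["12345", "67890"]
```

The third link is skipped because its href does not begin with /abc/.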

What about splitting the string by /?

(I don't know Hpricot, but according to the docs):

doc.search("a[@href]").each do |a|
  # index 2, because the string starts with '/' and so the first element is empty
  puts a["href"].split("/")[2]
end
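
One thing the split approach does not do by itself is check that the href really starts with '/abc/'. A minimal sketch of that check, working on plain href strings so it does not depend on either parser (segment_after_abc is a hypothetical name, not part of Hpricot or Nokogiri):

```ruby
# Hypothetical helper: return the segment right after "/abc/" in an href,
# or nil when the href does not start with that prefix.
def segment_after_abc(href)
  return nil unless href.start_with?("/abc/")
  # "/abc/12345/xyz123/".split("/") => ["", "abc", "12345", "xyz123"]
  href.split("/")[2]
end

hrefs = ["/abc/12345/xyz123/", "/abc/67890/xyzabc/", "/other/999/"]
p hrefs.map { |h| segment_after_abc(h) }.compact # prints ["12345", "67890"]
```

With a parser in hand, the hrefs array would come from the matched 'a' elements instead of a literal list.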
