Ruby Regex Help

Developer https://www.devze.com 2022-12-22 17:27 Source: Internet
I know a little bit of regex, but not much. What is the best way to get just the number out of the following HTML? (I want to have 32 returned.) The values of width, rowspan, and size are all different in this horrible HTML page. Any help?

<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>


How about

>(\d+)<
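
For illustration, here is a minimal sketch of how that pattern could be applied in Ruby. The helper name `extract_number` is made up for this example and is not part of the original answer. It relies on `String#[]` with a regex and a group index, which returns the text of that capture group or nil when nothing matches.

```ruby
# Hypothetical helper (name invented for this sketch): returns the first
# run of digits that sits directly between '>' and '<', as an Integer,
# or nil when the markup contains no such run.
def extract_number(html)
  text = html[/>(\d+)</, 1] # group 1 holds only the digits
  text && text.to_i
end

row = '<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>'
puts extract_number(row)
```

Note that this only looks at the first match, so a page with several numeric cells would need `scan` instead.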

Or, if you desperately want to avoid using capturing groups at all:

(?<=>)\d+(?=<)
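
A sketch of the lookaround variant in Ruby, assuming you may want every number on the page rather than just one. The helper name `extract_numbers` is hypothetical. Because the pattern has no capturing groups, `String#scan` returns the matched strings themselves.

```ruby
# Hypothetical helper (name invented for this sketch): collects every run
# of digits that is immediately preceded by '>' and followed by '<'.
def extract_numbers(html)
  # With no groups in the pattern, scan yields an array of whole matches.
  html.scan(/(?<=>)\d+(?=<)/).map(&:to_i)
end

row = '<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>'
p extract_numbers(row)
```

The lookbehind and lookahead check for the angle brackets without consuming them, which is why no group index is needed.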


Please, do yourself a favor and use an HTML parser instead:

#!/usr/bin/env ruby
require 'nokogiri'
require 'test/unit'

class TestExtraction < Test::Unit::TestCase
  def test_that_it_extracts_the_number_correctly
    # Parse the markup from the question.
    doc = Nokogiri::HTML('<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>')
    # Select each <font> that is a direct child of a <td> and convert its text to an integer.
    assert_equal [32], doc.xpath('//td/font').map { |el| el.text.to_i }
  end
end


Maybe:

<td[^>]*><font[^>]*>(\d+)</font></td>
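
A sketch of how this stricter pattern might be used in Ruby, with a capturing group around `\d+` so that the number itself can be returned. The names `CELL_PATTERN` and `number_from_cell` are invented for this example.

```ruby
# Hypothetical constant: matches a <td> that directly wraps a <font>
# containing only digits. %r{} avoids escaping the forward slashes.
CELL_PATTERN = %r{<td[^>]*><font[^>]*>(\d+)</font></td>}

# Hypothetical helper: returns the number as an Integer, or nil when the
# markup does not have exactly this td/font shape.
def number_from_cell(html)
  match = CELL_PATTERN.match(html)
  match && match[1].to_i
end

row = '<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>'
puts number_from_cell(row)
```

Compared with the shorter patterns, this one ignores numbers that appear in other elements, at the cost of breaking if the page nests the tags differently.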