I tried a simple script with
arr = data.scan /<td>([^<]+)/
and arr is filled with the data between <td> and </td>
when it is run using
ruby try.rb
but when it is run using
ruby script/runner app/try.rb
so that it runs just as it would inside script/console, there is an extra </td> attached to the matched data. Why would that be? It is Ruby 1.8.7 with Rails 2.3.8. Could it be due to Unicode handling in the app environment, or something else?
I would leave this as a comment, since it doesn't really answer anything, but I'm new around here and don't have the rep to do so. Please excuse me.
I mocked up the setup using Ruby 1.8.7 with a fully functional app on Rails 2.3.8, and both times I got the proper output, without the trailing </td> you mention. Now I am curious as to what is in data. I used a generic table in a pretty simple HTML document, and it works as it should.
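For reference, here is a minimal sketch of that kind of check. The contents of data are made up, since the original input was never posted, and the cells variable is only there for illustration. The same file can be run both with plain ruby and through script/runner to compare the two outputs.

```ruby
# Hypothetical sample input; the real `data` from the question is unknown.
data = "<table><tr><td>alpha</td><td>beta</td></tr></table>"

# Same pattern as in the question: capture everything after <td>
# up to (but not including) the next "<".
arr = data.scan(/<td>([^<]+)/)

# scan returns one single-element array per match because the
# pattern has one capture group, so flatten it for readability.
cells = arr.flatten

# inspect makes any stray characters in the matches visible.
puts cells.inspect
```

If the two runs print different results for the real input, dumping data.inspect in both environments may show whether the input itself differs.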
One last thing worth mentioning: is using a regex to parse HTML a good idea? I have never needed to do it myself, but hpricot looks pretty neat for just that sort of thing: http://github.com/hpricot/hpricot
Hope this helps at least a little.