
Ruby on Rails: regular expression to find and remove tags between tags in an HTML string

Developer https://www.devze.com 2022-12-15 00:09 Source: Internet

I'm working in Ruby on Rails and need the following:

Remove all "br" HTML tags between "code" HTML tags in a string of HTML. The "code" tags might occur more than once.

Now, it's not screen scraping I'm trying to do. I have a blog and would like to allow people to use the code HTML tags only in the comments. When formatting the string I normally use simple_format, but I'd like it to ignore code HTML tags.

Thanks in advance.


If you absolutely positively have to use regexp, try this one, which catches all <br>, <br/> and <br /> tags:

str.gsub(/<code>.+?<\/code>/) {|s| s.gsub(/<br\s*\/?>/, "")}

Tested with:

str = "Lorem ipsum dolor sit amet<br />, <code>consectetur adipisicing elit<br />, sed do eiusmod tempor incididunt ut labore<br> et dolore magna aliqua</code>. Ut enim ad minim veniam,<br> quis nostrud exercitation ullamco laboris nisi<br/> ut aliquip ex ea commodo consequat. <code>Duis aute irure dolor in reprehenderit<br /> in voluptate velit esse cillum dolore<br/> eu fugiat nulla pariatur.</code> Excepteur sint occaecat cupidatat non proident,<br /> sunt in culpa qui officia deserunt mollit anim id est laborum."
p str.gsub(/<code>.+?<\/code>/) {|s| s.gsub(/<br\s*\/?>/, "")}
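One caveat with the pattern above: in Ruby, "." does not match a newline by default, so a code block that spans several lines would be skipped. A minimal sketch of a variant that covers that case, assuming the tags are written as plain <code> with no attributes (the helper name strip_br_in_code is hypothetical, not part of Rails):

```ruby
# Hypothetical helper; the name is an illustration, not a Rails API.
# Removes <br>, <br/> and <br /> tags, but only inside <code>...</code> blocks.
def strip_br_in_code(html)
  # /m lets "." match newlines, so multi-line code blocks are covered.
  # /i also accepts upper-case tags such as <CODE> or <BR>.
  html.gsub(%r{<code>.*?</code>}mi) do |block|
    block.gsub(%r{<br\s*/?>}i, "")
  end
end

comment = "Intro<br />\n<code>puts 1<br />\nputs 2<br></code>\nOutro<br/>"
puts strip_br_in_code(comment)
```

The "br" tags outside the code block are left alone, so this could be run on the output of simple_format. It does not handle nested code tags or tags with attributes.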

If you don't have to use regexp, use an HTML parser like Nokogiri.


Using Hpricot or an HTML parser of your choice would be a far, far better idea.


I second Hpricot, but what are you trying to do? Are you attempting some sort of web scraping, or are you parsing the HTML from a model?
