I am parsing HTML text using Nokogiri and making some changes to that HTML:
doc = Nokogiri::HTML.parse(html_code)
But I am using Mustache with that HTML, so it contains Mustache variables enclosed in double curly braces, e.g. {{mustache_variable}}.
After tinkering with the Nokogiri document, when I do
doc.to_html
these curly braces are escaped and I get something like %7B%7Bmustache_variable%7D%7D
But not all of the content is escaped. For example, if I have HTML such as
<label> {{mustache_variable}} </label>
it returns <label> {{mustache_variable}} </label>
But for HTML where the variable is inside an attribute, like <img src='{{mustache_variable}}'>
it returns <img src='%7B%7Bmustache_variable%7D%7D'>
So I am currently doing a gsub to replace %7B and %7D with { and } respectively so that Mustache works.
Is there a way to get the exact HTML back from Nokogiri, or a better solution?
You probably need the cgi module:
require 'cgi'
doc = Nokogiri::HTML.parse(html_code)
CGI.unescapeHTML(doc.to_html)
Or you can use the htmlentities library.
You could also try doc.content instead of doc.to_html.
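For what it's worth, here is a quick check of what the two CGI helpers actually decode. It works on a plain string (no Nokogiri involved), and the sample `escaped` value is just an assumed stand-in for the kind of output described in the question:

```ruby
require 'cgi'

# Assumed sample of serialized output with percent-encoded braces.
escaped = "<img src=\"%7B%7Bmustache_variable%7D%7D\">"

# unescapeHTML only decodes HTML entities such as &amp; or &lt;,
# so percent-encoded braces pass through unchanged.
entity_decoded = CGI.unescapeHTML(escaped)

# unescape decodes percent-escapes (and turns "+" into a space),
# so it touches every encoded character in the string,
# not just the Mustache tags.
percent_decoded = CGI.unescape(escaped)

puts entity_decoded
puts percent_decoded
```

Because CGI.unescape rewrites every percent-escape in the string, running it over a whole document may also change URLs that were meant to stay encoded.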
I ran into this same problem and ended up using a regular expression to convert the escaped double braces:
html_doc.gsub(/%7B%7B(.+?)%7D%7D/, '{{\1}}')
To make this safer, I'd recommend prefixing each mustache variable with a namespace, just in case some of the HTML does have the escaped double brace pattern intentionally, e.g.
html_doc.gsub(/%7B%7Bnamespace(.+?)%7D%7D/, '{{namespace\1}}')
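A minimal sketch of the same idea wrapped in a helper, so it can be applied to the string returned by doc.to_html. The name `restore_mustache` and the sample input are made up for illustration; the match is case-insensitive in case the escapes come out as %7b/%7d, and it does not try to decode any other escapes (such as %20) inside the tag:

```ruby
# Hypothetical helper: turn percent-encoded Mustache delimiters back into
# literal braces. Only %7B%7B ... %7D%7D pairs are touched.
MUSTACHE_ESCAPED = /%7B%7B(.+?)%7D%7D/i

def restore_mustache(html)
  html.gsub(MUSTACHE_ESCAPED) { "{{#{Regexp.last_match(1)}}}" }
end

# Assumed sample of serialized output: the attribute value is escaped,
# the text node is not.
serialized = "<img src=\"%7B%7Bmustache_variable%7D%7D\">" \
             "<label> {{mustache_variable}} </label>"

puts restore_mustache(serialized)
```

Tags that were never escaped (like the one in the label) are left alone, so it should be safe to run the helper over the whole serialized document.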