I want to parse an HTML file using Nokogiri. I am able to do that, but I only want text and no CDATA or JavaScript, since my script and div tags are all over the file.
You can delete all script elements,
doc.search('script').remove
… and then select all text elements
doc.xpath('//text()')
… or just select the text elements within div elements
doc.xpath('//div//text()')