How to get content without nested elements with Nokogiri

Developer https://www.devze.com 2023-02-03 20:01 Source: Internet
require 'nokogiri'

src = '<paragraph>And bla foo <note>not important</note> bar baz</paragraph>'
doc = Nokogiri::XML(src)
puts doc.xpath('paragraph').first.content

The code above returns:

"And bla foo not important bar baz"

I am looking for a way to get the content without the text of nested elements. The XML above is just an example, but for it I want this as the result:

"And bla foo bar baz"


puts doc.xpath('paragraph/child::text()')

I've not used XPath in anger for many years but that seems to work.

Or better yet, squeeze out the doubled space left where the nested element was:

puts doc.xpath('paragraph/child::text()').to_s.squeeze(' ')


You could do something like

doc.xpath('paragraph').children.map { |e| e.text if e.text? }.join

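That will return 'And bla foo  bar baz' from your example. Note the doubled space where the <note> element was; append .squeeze(' ') if you want it collapsed.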
