Strange behavior with tagsoup and Groovy's XmlSlurper

Developer https://www.devze.com 2023-02-07 03:18 Source: Internet

Let's say I want to parse the phone number from an XML string like this:

str = """ <root>
            <address>123 New York, NY 10019
                <div class="phone"> (212) 212-0001</div> 
            </address> 
        </root> 
    """
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.div.text()

It doesn't print the phone number.

If I change the "div" element to "foo" like this:

str = """ <root> 
            <address>123 New York, NY 10019
                <foo class="phone"> (212) 212-0001</foo> 
            </address> 
        </root> 
    """
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.foo.text()

Then it's able to parse and print the phone number.

What the heck is going on?

Btw, I am using Groovy 1.7.5 and TagSoup 1.2.

Just change the code to:

println parser.address.'div'.text()

This is a curse of Groovy and many other dynamic languages: "div" is a reserved method name, so instead of getting the node you end up trying to divide the "address" node :)
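
To illustrate the quoted-property form, here is a minimal sketch. It assumes the plain XmlSlurper rather than TagSoup, purely so the snippet runs without extra dependencies (the sample string happens to be well-formed XML), so the tree it produces may differ from what TagSoup builds. The variable names `xml`, `root` and `phone` are illustrative.

```groovy
// Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is a default import there)
import groovy.xml.XmlSlurper

def xml = '''<root>
    <address>123 New York, NY 10019
        <div class="phone"> (212) 212-0001</div>
    </address>
</root>'''

// Assumption: plain XML parser instead of TagSoup, so the result's root is <root>
def root = new XmlSlurper().parseText(xml)

// Quoting the name makes it explicit that 'div' is meant as an element name
def phone = root.address.'div'.text().trim()
println phone
```

The `trim()` call is there only because the sample text starts with a space.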

I seem to recall that TagSoup normalizes HTML tags, i.e. it uppercases them. So the GPath expression you want is probably

println parser.ADDRESS.DIV.text()
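
If you are not sure which case the parser emits for element names, one option is to match names case-insensitively instead of guessing. A small sketch, assuming the plain XmlSlurper (not TagSoup) so that it runs without extra dependencies; the names `xml`, `root` and `div` are illustrative.

```groovy
// Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is a default import there)
import groovy.xml.XmlSlurper

def xml = '''<root>
    <address>123 New York, NY 10019
        <div class="phone"> (212) 212-0001</div>
    </address>
</root>'''

// Assumption: plain XML parser instead of TagSoup
def root = new XmlSlurper().parseText(xml)

// '**' walks the tree depth-first; equalsIgnoreCase accepts div, DIV, Div, ...
def div = root.'**'.find { it.name().equalsIgnoreCase('div') }
println div.text().trim()
```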

I find it handy to be able to print out the result of the parse, because then you can see why your GPath isn't working. Use this:

println groovy.xml.XmlUtil.serialize(parser)
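
As a rough illustration of that debugging step, the sketch below serializes a parsed tree back to text so the element names and nesting can be inspected. It assumes the plain XmlSlurper instead of TagSoup to stay dependency-free, so the structure printed here is not necessarily what TagSoup would produce; `xml`, `root` and `dump` are illustrative names.

```groovy
// Groovy 3+; on Groovy 2.x drop the XmlSlurper import (groovy.util.XmlSlurper is a default import there)
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil

def xml = '''<root>
    <address>123 New York, NY 10019
        <div class="phone"> (212) 212-0001</div>
    </address>
</root>'''

// Assumption: plain XML parser instead of TagSoup
def root = new XmlSlurper().parseText(xml)

// Turn the parsed tree back into a string to see which elements exist and how they nest
def dump = XmlUtil.serialize(root)
println dump
```

Comparing the printed element names and nesting against the GPath expression usually shows where the path stops matching.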

I know this question is very old, but I ran into the same problem recently, and this is what I used:

parser.'**'.findAll { it.name() == 'div' && it.@class.text() == 'phone' }.each { div ->
    println div.text()
}

Use depthFirst ('**') to find all tags.
Filter by the name div and the class phone.
Print the value (212) 212-0001.

The Groovy version is 2.4.
