I'm trying to figure out how to use XPath to get the exceptionID and instrumentID values out of the XML snippet embedded in the following XML document (yes, having XML inside CDATA is a little odd, but that's what I get from the 3rd party service):
<?xml version="1.0"?>
<exception>
<info>
<![CDATA[
<info>
<exceptionID>1</exceptionID>
<instrumentID>1</instrumentID>
</info>
]]>
</info>
</exception>
Is it possible to get the values in one XPath statement?
I'm using javax.xml.xpath.XPath inside Java (JDK 1.5 with Xalan 2.7.1 and Xerces 2.9.1), e.g.
XPath xpath = XPathFactory.newInstance().newXPath();
Long exceptionId = new Long(((Double)xpath.evaluate(this.exceptionIdXPath,
document, XPathConstants.NUMBER)).longValue());
It's the this.exceptionIdXPath variable that I'm not sure how to set. I know, for example, that:
/exception/info/text()/info/exceptionID
won't work (text() returns the data inside the CDATA, but as a plain string with no 'knowledge' that it is XML).
Yes, you can do it, but anything inside the CDATA section is just a string and won't be parsed into element nodes in the DOM. Therefore, you have to use XPath's string manipulation functions.
In XPath you can use substring-before and substring-after. Something like this may work:
substring-before(substring-after(/exception/info,"<exceptionID>"), "</exceptionID>")
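Here is a rough sketch of how that expression could be plugged into the javax.xml.xpath code from the question. The class name, the XML constant and the substringXPath/extractId helpers are made up for illustration, and it assumes the element you want appears exactly once inside the CDATA text and holds a plain number.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataSubstringExample {

    // Sample document from the question, inlined so the sketch is self-contained.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<exception>\n<info>\n<![CDATA[\n<info>\n"
        + "<exceptionID>1</exceptionID>\n"
        + "<instrumentID>1</instrumentID>\n"
        + "</info>\n]]>\n</info>\n</exception>";

    // Hypothetical helper: builds the substring expression for one element name.
    static String substringXPath(String tag) {
        return "substring-before(substring-after(/exception/info, '<" + tag + ">'), '</"
            + tag + ">')";
    }

    // Hypothetical helper: evaluates the expression and converts the result to a long.
    static long extractId(String xml, String tag) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Asking for NUMBER makes XPath convert the extracted string to a double.
            Double value = (Double) xpath.evaluate(
                substringXPath(tag), document, XPathConstants.NUMBER);
            return value.longValue();
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not extract " + tag, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractId(XML, "exceptionID"));
        System.out.println(extractId(XML, "instrumentID"));
    }
}
```

Each value needs its own expression, so this is one XPath statement per value rather than one for both. If the tag is missing, the substring functions return an empty string, the NUMBER conversion gives NaN, and longValue() turns that into 0, so you may want to check for that case.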
This is going to be very specific to the tools you're using (it would be good to know what platform and libraries you're using), but generally you can't do this in a single step. The whole point of CDATA is that it's raw character data and not necessarily XML.
What you can do is capture the text() in exception/info (basically the contents of your CDATA block) and create a new XML document (in memory) from that, and then use XPath over that document.
The detailed steps for this are platform-dependent.
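For the Java setup mentioned in the question, the two-step approach might look roughly like the sketch below. The class name, the XML constant and the parse/extractIds helpers are invented for illustration. It reads the string value of /exception/info rather than text(), on the assumption that the CDATA block is the only meaningful content of that element, and trims the surrounding whitespace before parsing it as a second document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataReparseExample {

    // Sample document from the question, inlined so the sketch is self-contained.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<exception>\n<info>\n<![CDATA[\n<info>\n"
        + "<exceptionID>1</exceptionID>\n"
        + "<instrumentID>1</instrumentID>\n"
        + "</info>\n]]>\n</info>\n</exception>";

    // Hypothetical helper: parses a string into a DOM document.
    static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: returns {exceptionID, instrumentID}.
    static long[] extractIds(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Step 1: pull the CDATA content out of the outer document as a string.
            Document outer = parse(builder, xml);
            String embedded = xpath.evaluate("/exception/info", outer).trim();

            // Step 2: parse that string as its own document and query it normally.
            Document inner = parse(builder, embedded);
            Double exceptionId = (Double) xpath.evaluate(
                "/info/exceptionID", inner, XPathConstants.NUMBER);
            Double instrumentId = (Double) xpath.evaluate(
                "/info/instrumentID", inner, XPathConstants.NUMBER);
            return new long[] { exceptionId.longValue(), instrumentId.longValue() };
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not read embedded XML", e);
        }
    }

    public static void main(String[] args) {
        long[] ids = extractIds(XML);
        System.out.println("exceptionID=" + ids[0] + ", instrumentID=" + ids[1]);
    }
}
```

This costs a second parse, but the inner document can then be queried with ordinary paths such as /info/exceptionID, and it fails loudly if the embedded text is not well-formed XML instead of silently returning an empty string.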