XPath to a tag inside CDATA

Developer https://www.devze.com 2023-03-27 04:12 Source: web
I want to find the XPath to a tag which is inside a CDATA section. The XML fragment is below.

<books>
 <book>
  <title></title>
  <content><![CDATA[<p>Hi hello Hw r u?</p><p>We are fine</p><p>Hi babeeee!!!!</p>]]>    </content>
 </book>
</books>

I want to get the text inside the first <p> tag inside <content>. Can anybody please give the correct XPath for it?


A CDATA section contains arbitrary character data. In contrast to PCDATA (parsed character data), its content is not parsed as markup, so there is no XPath to "elements" inside of it.
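To make this concrete, here is a minimal sketch in Python using the standard-library `xml.etree.ElementTree` (Python is an assumption here; the question does not name a language). It loads the fragment from the question and shows that <content> has no child elements, only a text string that happens to contain angle brackets.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the question, unchanged.
XML = """<books>
 <book>
  <title></title>
  <content><![CDATA[<p>Hi hello Hw r u?</p><p>We are fine</p><p>Hi babeeee!!!!</p>]]>    </content>
 </book>
</books>"""

root = ET.fromstring(XML)
content = root.find("./book/content")

# The CDATA section is delivered as plain character data,
# so <content> has no element children at all.
child_count = len(content)
p_matches = content.findall("p")

# The "markup" is only visible as text.
text = content.text

print(child_count, p_matches)
print(repr(text))
```

Any path expression that steps into <content> looking for a <p> element comes back empty for the same reason.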


As Leif said, the content of the CDATA section is not parsed, so it's just text, even though it looks like markup. You'd have to parse it yourself, which you could do using Saxon (9.1 or later, commercial editions) and saxon:parse. You'd then find that it's not a well-formed document (it has several top-level <p> elements and no single root), so you'd probably have to resort to a parser such as TagSoup to parse it.
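The "parse it a second time" idea can be sketched outside XSLT as well. The Python example below is an illustration under assumptions, not the Saxon approach itself: it wraps the CDATA text in a made-up `<wrapper>` element so that the three sibling <p> elements get a single root, then parses the result with `xml.etree.ElementTree`. The helper name `first_p_text` is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<books>
 <book>
  <title></title>
  <content><![CDATA[<p>Hi hello Hw r u?</p><p>We are fine</p><p>Hi babeeee!!!!</p>]]>    </content>
 </book>
</books>"""


def first_p_text(content_text: str) -> str:
    """Return the text of the first <p> in a fragment held as a string.

    Assumes the fragment is well-formed XML apart from lacking a single
    root element; the <wrapper> element is an illustrative addition.
    """
    wrapped = "<wrapper>" + content_text + "</wrapper>"
    fragment = ET.fromstring(wrapped)  # second parse: text becomes elements
    first = fragment.find("p")
    if first is None:
        return ""
    return first.text or ""


content = ET.fromstring(XML).find("./book/content")
result = first_p_text(content.text)
print(result)
```

This only works while the embedded fragment is otherwise well-formed XML. For real-world HTML with unclosed tags or undeclared entities, a lenient parser of the kind mentioned above would still be needed.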

You could also treat it as a string:

<xsl:stylesheet version="1.0"
  xmlns:saxon="http://saxon.sf.net/"
  exclude-result-prefixes="saxon"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Root>
      <!--xsl:value-of select="saxon:parse(/books/book/content)"/-->
      <xsl:for-each select="books/book/content">
        <xsl:value-of select="
          substring-before(
          substring-after( . , '&gt;' ), '&lt;' ) "/>
      </xsl:for-each>
    </Root>
  </xsl:template>
</xsl:stylesheet>