
XPath with Java - Selecting a text value between subtags

Developer https://www.devze.com 2023-02-11 01:54 Source: Internet

I'm working on this html snippet:

<p class="pageSelector">
    <a href="/BlaBla">&lt; Prev</a>&nbsp;
    <a href="/BlaBla">1</a>&nbsp;
    <a href="/BlaBla">2</a>&nbsp;
    <a href="/BlaBla">3</a>&nbsp;
    4&nbsp;
    <a href="/BlaBla">5</a>&nbsp;
    <a href="/BlaBla">6</a>&nbsp;
    <a href="/BlaBla">Next &gt;</a>&nbsp;
</p>

which renders (more or less) as < Prev 1 2 3 4 5 6 Next >.

I want to select the "4" because I need to discover the 'current' page. Using

//p[@class='pageSelector']/text()[normalize-space()]

(tested with the Firefox XPath Checker) I thought I'd solved it, but no: I get 7 matches.

Could anyone please tell me where I'm going wrong? Thank you.


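normalize-space() strips leading and trailing whitespace and collapses internal runs of it, but the no-break space character (U+00A0, the &nbsp; in your markup), despite its visual appearance, is not considered to be whitespace for this purpose. Every text node that contains an &nbsp; therefore passes your predicate, which is where the 7 matches come from. So I would do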

text()[translate(., '&#x20;&#x09;&#x0a;&#x0d;&#xa0;', '')]

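which will return those child text nodes that contain at least one character other than whitespace or a no-break space; you may then need to process the result further to extract the part of the content you want. Note that the &#x..; character references are only resolved when the expression is embedded in XML (for example in a stylesheet); XPath itself has no such escapes, so from Java you would put the actual characters into the string.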
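
Since the question is about Java, here is a minimal sketch of that approach using the standard javax.xml.xpath API. It assumes the page has already been turned into well-formed XML (&nbsp; is not a predefined XML entity, so the sample below writes it as U+00A0), and the class and method names (PageSelectorDemo, countMatches, currentPage) are made up for illustration. The blank characters are written as Java escapes instead of XML character references.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class PageSelectorDemo {
    // The snippet from the question as well-formed XML; each &nbsp; is written as U+00A0.
    static final String SAMPLE =
        "<p class=\"pageSelector\">\n"
        + "    <a href=\"/BlaBla\">&lt; Prev</a>\u00a0\n"
        + "    <a href=\"/BlaBla\">1</a>\u00a0\n"
        + "    <a href=\"/BlaBla\">2</a>\u00a0\n"
        + "    <a href=\"/BlaBla\">3</a>\u00a0\n"
        + "    4\u00a0\n"
        + "    <a href=\"/BlaBla\">5</a>\u00a0\n"
        + "    <a href=\"/BlaBla\">6</a>\u00a0\n"
        + "    <a href=\"/BlaBla\">Next &gt;</a>\u00a0\n"
        + "</p>";

    // Space, tab, LF, CR and the no-break space, as real characters in the Java string.
    static final String BLANKS = " \t\n\r\u00a0";

    // The expression from the question: no-break spaces count as content here.
    static final String NAIVE =
        "//p[@class='pageSelector']/text()[normalize-space()]";

    // The suggested fix: a text node matches only if something is left
    // after deleting every blank character, including U+00A0.
    static final String FIXED =
        "//p[@class='pageSelector']/text()[translate(., '" + BLANKS + "', '')]";

    // Number of nodes selected by a node-set expression.
    static int countMatches(String nodeExpr) {
        Double n = (Double) evaluate("count(" + nodeExpr + ")", XPathConstants.NUMBER);
        return n.intValue();
    }

    // String value of the first matching text node, with the blanks deleted.
    static String currentPage() {
        String expr = "translate(" + FIXED + ", '" + BLANKS + "', '')";
        return (String) evaluate(expr, XPathConstants.STRING);
    }

    private static Object evaluate(String expr, QName type) {
        try {
            // A fresh InputSource per call, because the reader is consumed by parsing.
            InputSource source = new InputSource(new StringReader(SAMPLE));
            return XPathFactory.newInstance().newXPath().evaluate(expr, source, type);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("naive matches: " + countMatches(NAIVE));
        System.out.println("fixed matches: " + countMatches(FIXED));
        System.out.println("current page:  " + currentPage());
    }
}
```

On this sample the original expression should select the 7 text nodes that contain a no-break space, while the translate() version selects only the one holding the 4. Real-world HTML is rarely well-formed XML, so in practice you would probably need an HTML-tolerant parser in front of this.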


If you're using XSLT, you could add a further template that matches the text inside the a tags and outputs nothing for it:

<xsl:template match="p[@class='pageSelector']/a/text()[normalize-space()]">
</xsl:template>

This means that you're left with just the 4 (plus the whitespace and no-break spaces around it in the p element's own text nodes).
