How do you grab the content matched by an XPath expression without also copying the surrounding markup?
<div id="node-123" class="clearfix">
<div class="content">
<div class="body">
<p><img src="/images/image.jpg"/></p>
<p>Some content ....</p>
</div>
</div>
</div>
If I use //div[@id='node-123']/div/div, I still get the surrounding <div class="body">, which is not what I want.
What I want is the content of <div class="body">, excluding the <div class="body"> markup itself but preserving the other markup inside it (p, img, etc.).
I tried a wildcard, //div[@id='node-123']/div/div/*, but this only fetches the first p, and there can be two or more p elements. Using node() fetches nothing.
Any hint would be very much appreciated.
Thanks
If I used //div[@id='node-123']/div/div, I still get surrounding <div class="body"> which is not expected. What I want is the content of <div class="body">, excluding this <div class="body"> markup, but reserving other markups inside the content, p, img, etc.
Use:
//div[@id='node-123']/div/div/node()
This selects all nodes (elements, text nodes, processing instructions and comment nodes) that are children of any div element that is a child of any div element that is a child of any div element in the document whose id attribute has the value 'node-123'.
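To make the intended result concrete, here is a minimal Python sketch using only the standard library. The XPath subset supported by xml.etree.ElementTree does not include node(), so the sketch collects the child nodes by hand; a full XPath 1.0 engine would evaluate the expression above directly. The helper name inner_xml is invented for this illustration, and the sample markup is the one from the question.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div id="node-123" class="clearfix">
<div class="content">
<div class="body">
<p><img src="/images/image.jpg"/></p>
<p>Some content ....</p>
</div>
</div>
</div>"""


def inner_xml(element):
    """Serialize an element's children (text and elements) without its own tags.

    Hypothetical helper: it approximates what .../node() selects, restricted to
    what ElementTree keeps (text and elements; comments and processing
    instructions are dropped by the default parser).
    """
    parts = [element.text or ""]
    for child in element:
        # tostring() also emits the child's tail text, so text that sits
        # between sibling elements is preserved.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


root = ET.fromstring(SAMPLE)
# root is the div with id="node-123", so the path below is relative to it.
body = root.find("./div/div")
content = inner_xml(body)
print(content)
```

The printed content keeps both p elements and the img, but not the wrapping <div class="body">.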
Warning: It is good practice to avoid the // pseudo-operator when the structure of the XML document is statically known. Using // very often results in slow evaluation, because it can force a traversal of the complete tree.
The problem turned out to be an unterminated img tag in the actual original article: <img src="/images/image.jpg"> rather than <img src="/images/image.jpg"/>.
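A small Python sketch of why this matters, assuming the content is handed to a strict XML parser (the toolchain actually used in the question is not stated): the standard-library parser accepts the self-closed img but rejects the unterminated one, so no nodes can be selected from it. The names WELL_FORMED, UNTERMINATED and parses_as_xml are made up for this example.

```python
import xml.etree.ElementTree as ET

# Same body content, differing only in how the img tag is closed.
WELL_FORMED = (
    '<div class="body"><p><img src="/images/image.jpg"/></p>'
    "<p>Some content ....</p></div>"
)
UNTERMINATED = (
    '<div class="body"><p><img src="/images/image.jpg"></p>'
    "<p>Some content ....</p></div>"
)


def parses_as_xml(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The open <img> is never closed, so </p> is a mismatched tag.
        return False


print(parses_as_xml(WELL_FORMED))
print(parses_as_xml(UNTERMINATED))
```

If the source really is HTML rather than XML, a lenient HTML parser or a tidy-up step before applying XPath is the usual way around this.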