I parse a web page with the help of XPath. When I retrieve the content of a div element, the HTML contained in that div is omitted. How can I make it retrieve the whole content of the div element, HTML included?
Use:
someExprSelectingtheDiv/node()
This selects all the child nodes (markup and text) of the divs selected in the first location step of the expression.
Do not work with the string() value of any selected element, because that value is only the concatenation (in document order) of the element's descendant text nodes.
Also, the string value of a node-set is the string value of the first node (in document order) of this node-set.
This information should be sufficient to explain the observed behavior in evaluating the problematic XPath expression.
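To make the difference concrete, here is a minimal sketch in Python using only the standard library's xml.dom.minidom. It is not an XPath engine: iterating over childNodes merely mimics what someExprSelectingtheDiv/node() selects, and the string_value helper mimics the XPath string value. The sample document and the names SAMPLE, string_value and inner_markup are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample page; any well-formed markup would do.
SAMPLE = "<html><body><div id='content'>Hello <b>bold</b> world</div></body></html>"


def string_value(node):
    """Mimic the XPath string value: descendant text nodes, in document order."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def inner_markup(element):
    """Serialize every child node (elements and text), like div/node()."""
    return "".join(child.toxml() for child in element.childNodes)


doc = minidom.parseString(SAMPLE)
div = doc.getElementsByTagName("div")[0]

text_only = string_value(div)    # markup is lost
with_markup = inner_markup(div)  # markup is kept

print(text_only)    # Hello bold world
print(with_markup)  # Hello <b>bold</b> world
```

With a real XPath library the idea is the same: select the child nodes with node() and serialize each one, rather than asking for the string value of the div.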