
Get XPath from an org.w3c.dom.Node

Developers https://www.devze.com 2023-02-11 18:52 Source: Internet

Can I get the full XPath from an org.w3c.dom.Node?

Say the node currently points somewhere in the middle of the XML document. I would like to extract the XPath for that element.

The output XPath I'm looking for is //parent/child1/child2/child3/node, that is, a path from the parent down to the node. Just ignore XPaths that contain expressions and point to the same node.


There's no generic method for getting the XPath, mainly because there's no one generic XPath that identifies a particular node in the document. In some schemas, nodes will be uniquely identified by an attribute (id and name are probably the most common attributes.) In others, the name of each element (that is, the tag) is enough to uniquely identify a node. In a few (unlikely, but possible) cases, there's no one unique name or attribute that takes you to a specific node, and so you'd need to use cardinality (get the n'th child of the m'th child of...).

EDIT: In most cases, it's not hard to create a schema-dependent function to assemble an XPath for a given node. For example, suppose you have a document where every node is uniquely identified by an id attribute, and you're not using namespaces. Then (I think) the following pseudo-Java would work to return an XPath based on those attributes. (Warning: I have not tested this.)

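String getXPath(Node node)
{
    Node parent = node.getParentNode();
    if (parent == null || parent.getNodeType() == Node.DOCUMENT_NODE) {
        // root element: the name alone is enough
        return "/" + node.getNodeName();
    }
    // assumes node is an Element and every element carries a unique id attribute
    Element e = (Element) node;
    return getXPath(parent) + "/" + e.getTagName() + "[@id='" + e.getAttribute("id") + "']";
}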


I work for the company behind jOOX, a library that provides many useful extensions to the Java standard DOM API, mimicking the jQuery API. With jOOX, you can obtain the XPath of any element like this:

String path = $(element).xpath();

The above path will then be something like this:

/document[1]/library[2]/books[3]/book[1]
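
If you would rather not add a dependency, a path of the same shape can be assembled with the plain DOM API. The following is only a sketch, not jOOX's implementation: the class name PositionalXPath and the parse helper are made up for the example, and it assumes a document without namespaces.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class PositionalXPath {

    /** Builds an absolute path such as /library[1]/book[2] for an element. */
    static String positionalXPath(Element element) {
        StringBuilder path = new StringBuilder();
        // walk upwards until we leave the element tree (the Document node stops the loop)
        for (Node n = element; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            path.insert(0, "/" + n.getNodeName() + "[" + indexAmongSameNamedSiblings(n) + "]");
        }
        return path.toString();
    }

    /** 1-based position of the node among its siblings with the same element name. */
    static int indexAmongSameNamedSiblings(Node node) {
        int index = 1;
        for (Node s = node.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE && s.getNodeName().equals(node.getNodeName())) {
                index++;
            }
        }
        return index;
    }

    /** Hypothetical helper for the demo: parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<library><book/><book><title/></book></library>");
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        System.out.println(positionalXPath(title)); // /library[1]/book[2]/title[1]
    }
}
```

Every step gets an index, even when the element has no same-named siblings, which keeps the path unambiguous at the cost of being brittle when the document changes.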


I've taken this code from Mikkel Flindt's post and modified it so that it also works for attribute nodes.

public static String getFullXPath(Node n) {
// abort early
if (null == n)
  return null;

// declarations
Node parent = null;
Stack<Node> hierarchy = new Stack<Node>();
StringBuffer buffer = new StringBuffer();

// push element on stack
hierarchy.push(n);

switch (n.getNodeType()) {
case Node.ATTRIBUTE_NODE:
  parent = ((Attr) n).getOwnerElement();
  break;
case Node.ELEMENT_NODE:
  parent = n.getParentNode();
  break;
case Node.DOCUMENT_NODE:
  parent = n.getParentNode();
  break;
default:
  throw new IllegalStateException("Unexpected node type: " + n.getNodeType());
}

while (null != parent && parent.getNodeType() != Node.DOCUMENT_NODE) {
  // push on stack
  hierarchy.push(parent);

  // get parent of parent
  parent = parent.getParentNode();
}

// construct xpath
Object obj = null;
while (!hierarchy.isEmpty() && null != (obj = hierarchy.pop())) {
  Node node = (Node) obj;
  boolean handled = false;

  if (node.getNodeType() == Node.ELEMENT_NODE) {
    Element e = (Element) node;

    // is this the root element?
    if (buffer.length() == 0) {
      // root element - append a leading slash and the element name (absolute path)
      buffer.append("/");
      buffer.append(node.getNodeName());
    } else {
      // child element - append slash and element name
      buffer.append("/");
      buffer.append(node.getNodeName());

      if (node.hasAttributes()) {
        // see if the element has a name or id attribute
        if (e.hasAttribute("id")) {
          // id attribute found - use that
          buffer.append("[@id='" + e.getAttribute("id") + "']");
          handled = true;
        } else if (e.hasAttribute("name")) {
          // name attribute found - use that
          buffer.append("[@name='" + e.getAttribute("name") + "']");
          handled = true;
        }
      }

      if (!handled) {
        // no known attribute we could use - get sibling index
        int prev_siblings = 1;
        Node prev_sibling = node.getPreviousSibling();
        while (null != prev_sibling) {
          if (prev_sibling.getNodeType() == node.getNodeType()) {
            // XML names are case-sensitive, so compare them exactly
            if (prev_sibling.getNodeName().equals(
                node.getNodeName())) {
              prev_siblings++;
            }
          }
          prev_sibling = prev_sibling.getPreviousSibling();
        }
        buffer.append("[" + prev_siblings + "]");
      }
    }
  } else if (node.getNodeType() == Node.ATTRIBUTE_NODE) {
    buffer.append("/@");
    buffer.append(node.getNodeName());
  }
}
// return buffer
return buffer.toString();
}          
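
One thing to watch with the id and name predicates in the code above: the attribute value is spliced between single quotes, so a value that itself contains an apostrophe produces an invalid expression. XPath 1.0 has no escape character inside string literals, and a common workaround is to switch quote style or fall back to concat(). A sketch of such a helper follows; the class and method names are invented for the example.

```java
class XPathLiteral {

    /** Returns an XPath 1.0 string literal (or concat() call) that evaluates to value. */
    static String quote(String value) {
        if (!value.contains("'")) {
            return "'" + value + "'";      // no apostrophe: single quotes are safe
        }
        if (!value.contains("\"")) {
            return "\"" + value + "\"";    // no double quote: double quotes are safe
        }
        // both kinds of quote present: split on apostrophes and glue the pieces with concat()
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = value.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");
            }
            sb.append("'").append(parts[i]).append("'");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println("[@id=" + quote("plain") + "]");    // [@id='plain']
        System.out.println("[@id=" + quote("O'Brien") + "]");  // [@id="O'Brien"]
    }
}
```

With a helper like this, the predicate would be built as "[@id=" + quote(value) + "]" instead of hard-coding the single quotes.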


For me, this one worked best (using org.w3c.dom elements):

String getXPath(Node node)
{
    Node parent = node.getParentNode();
    if (parent == null)
    {
        return "";
    }
    return getXPath(parent) + "/" + node.getNodeName();
}
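
Keep in mind that a path built from element names alone may match more nodes than the one you started from. A quick way to check is to evaluate the result with javax.xml.xpath and count the hits. The sketch below repeats the method above; the class name, the sample XML, and the parse and countMatches helpers are made up for the demonstration.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class SimplePathCheck {

    /** Same logic as the answer: element names only, no positions or predicates. */
    static String getXPath(Node node) {
        Node parent = node.getParentNode();
        if (parent == null) {
            return "";
        }
        return getXPath(parent) + "/" + node.getNodeName();
    }

    /** Counts how many nodes the expression selects in the document. */
    static int countMatches(Document doc, String expression) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Hypothetical helper for the demo: parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root><item/><item/><other/></root>");
        Node secondItem = doc.getElementsByTagName("item").item(1);
        String path = getXPath(secondItem);
        System.out.println(path);                    // /root/item
        System.out.println(countMatches(doc, path)); // 2
    }
}
```

If the count is greater than one, you need positions or predicates to pin down a single node.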


Some IDEs that specialise in XML will do that for you.

Here are the best-known ones:

  1. oXygen
  2. Stylus Studio
  3. XMLSpy

For instance, in oXygen you can right-click an element in an XML document, and the context menu will have a 'Copy Xpath' option.

There are also a number of Firefox add-ons (such as XPather) that will happily do the job for you. With XPather, you just click on a part of the web page, select 'show in XPather' from the context menu, and you're done.

But, as Dan has pointed out in his answer, the XPath expression will be of limited use. For instance, it will not include predicates. Rather, it will look like this:

/root/nodeB[2]/subnodeX[2]

For a document like

<root>
   <nodeA>stuff</nodeA>
   <nodeB>more stuff</nodeB>
   <nodeB cond="thisOne">
       <subnodeX>useless stuff</subnodeX>
       <subnodeX id="MyCondition">THE STUFF YOU WANT</subnodeX>
       <subnodeX>more useless stuff</subnodeX>
   </nodeB>
</root>

The tools I listed will not generate

/root/nodeB[@cond='thisOne']/subnodeX[@id='MyCondition']

For instance, for an HTML page you'll end up with a pretty useless expression:

/html/body/div[6]/p[3]

And that's to be expected. If they had to generate predicates, how would they know which condition is relevant? There are zillions of possibilities.
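
To see that the positional and the predicate-based expressions really address the same node in the sample document, you can evaluate both with the standard javax.xml.xpath API. This is just a sketch; the class name and the parse and select helpers are invented for the example.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class PositionVersusPredicate {

    // the sample document from above, written as one string
    static final String XML =
            "<root>"
            + "<nodeA>stuff</nodeA>"
            + "<nodeB>more stuff</nodeB>"
            + "<nodeB cond=\"thisOne\">"
            + "<subnodeX>useless stuff</subnodeX>"
            + "<subnodeX id=\"MyCondition\">THE STUFF YOU WANT</subnodeX>"
            + "<subnodeX>more useless stuff</subnodeX>"
            + "</nodeB>"
            + "</root>";

    /** Parses the sample document. */
    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the first node selected by the expression, or null if there is none. */
    static Node select(Document doc, String expression) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse();
        Node byPosition = select(doc, "/root/nodeB[2]/subnodeX[2]");
        Node byPredicate = select(doc, "/root/nodeB[@cond='thisOne']/subnodeX[@id='MyCondition']");
        System.out.println(byPosition.isSameNode(byPredicate)); // true
        System.out.println(byPosition.getTextContent());        // THE STUFF YOU WANT
    }
}
```

Both expressions land on the same element today; the difference is which kind of change to the document breaks them.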


Something like this will give you a simple xpath:

public String getXPath(Node node) {
    return getXPath(node, "");
}

public String getXPath(Node node, String xpath) {
    if (node == null) {
        return "";
    }
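    String elementName = "";
    if (node instanceof Element) {
        // getLocalName() returns null for nodes created without namespace support
        // (for example, by a parser that is not namespace-aware), so fall back to the node name
        elementName = node.getLocalName() != null ? node.getLocalName() : node.getNodeName();
    }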
    Node parent = node.getParentNode();
    if (parent == null) {
        return xpath;
    }
    return getXPath(parent, "/" + elementName + xpath);
}