How to parse a DocumentFragment with the Java standard DOM API

Developer https://www.devze.com 2023-03-27 03:03 Source: the web
This is how I can parse a well-formed XML document in Java:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

// text contains the XML content
Document doc = builder.parse(new InputSource(new StringReader(text)));

An example for text is this:

<a>
  <b/>
</a>

How can I parse a DocumentFragment? For example, this:

<a>
  <b/>
</a>
<a>
  <b/>
</a>

NOTE: I want to use org.w3c.dom and no other libraries/technologies, if possible.
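
To make the problem concrete, here is a minimal sketch of what happens when both inputs go through the same DocumentBuilder call. The class name RootlessParseDemo and the helper isWellFormedDocument are made up for illustration; only the javax.xml.parsers and org.w3c.dom calls come from the snippet above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class RootlessParseDemo {
    // Same parsing code as in the question, with the imports it needs.
    static Document parse(String text) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(text)));
    }

    // Hypothetical helper: true if the text parses as a complete document,
    // false if the parser rejects it as not well-formed.
    static boolean isWellFormedDocument(String text) throws Exception {
        try {
            parse(text);
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String single = "<a>\n  <b/>\n</a>";
        String fragment = single + "\n" + single; // two top-level elements

        System.out.println(isWellFormedDocument(single));   // prints true
        // The parser's default error handler may also log the error to stderr here.
        System.out.println(isWellFormedDocument(fragment)); // prints false
    }
}
```

The second input is rejected because an XML document must have exactly one root element, which is why the fragment needs some other treatment.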


I just thought of a silly solution. I could wrap the fragment in a dummy element like this:

<dummy><a>
  <b/>
</a>
<a>
  <b/>
</a></dummy>

And then programmatically filter out that dummy element again, like this:

String wrapped = "<dummy>" + text + "</dummy>";
Document parsed = builder.parse(new InputSource(new StringReader(wrapped)));
DocumentFragment fragment = parsed.createDocumentFragment();

// Here, the document element is the <dummy/> element.
NodeList children = parsed.getDocumentElement().getChildNodes();

// Move dummy's children over to the document fragment
while (children.getLength() > 0) {
    fragment.appendChild(children.item(0));
}

But that's a bit lame. Let's see if there is a better solution.


Further expanding on the answers already given:

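// stringToDom is a helper that is not shown here; it is assumed to parse a String into a Document.
public static DocumentFragment stringToFragment(Document document, String source) throws Exception
{
    // Wrap the fragment so the parser sees a single root element.
    source = "<dummy>" + source + "</dummy>";
    Node node = stringToDom(source).getDocumentElement();
    // Copy the parsed subtree so its nodes belong to the target document.
    node = document.importNode(node, true);
    DocumentFragment fragment = document.createDocumentFragment();
    NodeList children = node.getChildNodes();
    // The NodeList is live: each appendChild removes that child from it.
    while (children.getLength() > 0)
    {
        fragment.appendChild(children.item(0));
    }
    return fragment;
}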
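
For anyone who wants to try this directly, below is a self-contained sketch of the same idea. The stringToDom body is my assumption about what that helper does (the method above only calls it), and the class name FragmentParser is invented for the example. The loop uses getFirstChild() instead of the live NodeList, which should behave the same way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FragmentParser {
    // Assumed implementation of the helper that the answer calls but does not show.
    static Document stringToDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    static DocumentFragment stringToFragment(Document document, String source) throws Exception {
        // Wrap the fragment in a throwaway root so it parses as a document.
        Node wrapper = stringToDom("<dummy>" + source + "</dummy>").getDocumentElement();
        // importNode makes a deep copy owned by the target document.
        wrapper = document.importNode(wrapper, true);
        DocumentFragment fragment = document.createDocumentFragment();
        // appendChild moves the node, so getFirstChild() advances on every pass.
        while (wrapper.getFirstChild() != null) {
            fragment.appendChild(wrapper.getFirstChild());
        }
        return fragment;
    }

    public static void main(String[] args) throws Exception {
        Document target = stringToDom("<root/>");
        DocumentFragment fragment = stringToFragment(target, "<a><b/></a><a><b/></a>");
        // Appending a fragment inserts its children, not the fragment node itself.
        target.getDocumentElement().appendChild(fragment);
        System.out.println(target.getElementsByTagName("a").getLength()); // prints 2
    }
}
```

Note that whitespace between the top-level elements ends up in the fragment as text nodes, so filter on node type if only elements are wanted.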


I would suggest not using the DOM API. It's slow and ugly.

Use streaming StAX instead. It's built into JDK 1.6+. You can fetch one element at a time, and it won't choke if you're missing a root element.

http://en.wikipedia.org/wiki/StAX

http://download.oracle.com/javase/6/docs/api/javax/xml/stream/XMLStreamReader.html
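
A rough sketch of what pulling one element at a time looks like with XMLStreamReader. I have not confirmed how every StAX implementation reacts to input with several top-level elements, so this sketch plays it safe and still wraps the text in a dummy element; the class name StaxTopLevelElements and the method topLevelNames are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxTopLevelElements {
    // Collects the names of the fragment's top-level elements in document order.
    static List<String> topLevelNames(String fragmentText) throws Exception {
        // Assumption: wrap the fragment so the reader sees a single root.
        String wrapped = "<dummy>" + fragmentText + "</dummy>";
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(wrapped));
        List<String> names = new ArrayList<>();
        int depth = 0; // 1 = inside <dummy>, 2 = a top-level element of the fragment
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2) {
                        names.add(reader.getLocalName());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String text = "<a>\n  <b/>\n</a>\n<a>\n  <b/>\n</a>";
        System.out.println(topLevelNames(text)); // prints [a, a]
    }
}
```

This never builds a tree, so it does not produce a DocumentFragment; it only fits if the elements can be processed as they stream past.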
