Get the Node value for the first Node

Developer https://www.devze.com 2023-03-10 11:31 Source: Internet
I have the following XML:

<?xml version='1.0' ?>
<foo>A&gt;B</foo>

and I just want to get the text of the <foo> element as A&gt;B. If we use getNodeValue(), it converts the text to A>B, which is not what I want.

Hence I decided to use the Transformer:

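        Document doc = getParsedDoc(abovexml);
        // Assumption: node is the <foo> element (the original snippet did not define it)
        Node node = doc.getDocumentElement();
        TransformerFactory tranFact = TransformerFactory.newInstance();
        Transformer transfor = tranFact.newTransformer();
        // Leave the <?xml ...?> declaration out of the serialized output
        transfor.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        Source src = new DOMSource(node);
        StringWriter buffer = new StringWriter();
        Result dest = new StreamResult(buffer);
        // Serialize the node into the string buffer
        transfor.transform(src, dest);
        String result = buffer.toString();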

But this gives <foo>A&gt;B</foo> as the result, including the tags.

It would be helpful if somebody could clarify whether there is an approach that returns A&gt;B without doing string manipulation on the above output (<foo>A&gt;B</foo>).


Since getNodeValue() automatically decodes the string, you can use StringEscapeUtils from Apache Commons Lang to encode it again.

http://commons.apache.org/lang/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
http://commons.apache.org/lang/

String nodeValue = StringEscapeUtils.escapeHtml(getNodeValue());

That encodes it into the format you want. It is not very performance friendly, because you re-encode every node value.
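If adding Commons Lang is not an option, the same re-encoding step can be sketched with the JDK alone. This is only an illustration under some assumptions: the class name `TextEscaper` and its two methods are made up here, and the escaper handles only `&`, `<` and `>`, which is what element text needs. It is not a replacement for a full escaping library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TextEscaper {

    // Re-escape the characters that are special inside XML element text.
    public static String escapeText(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Parse the XML, read the decoded text of the root element, and escape it again.
    public static String escapedRootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return escapeText(doc.getDocumentElement().getTextContent());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version='1.0' ?>\n<foo>A&gt;B</foo>";
        System.out.println(escapedRootText(xml)); // prints A&gt;B
    }
}
```

The parser hands back A>B, and the helper turns it into A&gt;B again, so no string surgery on the Transformer output is needed.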


Actually getNodeValue() is not "converting" the string. When the XML is parsed from a file, or produced by a transformation, the resulting information model is that the string is A>B, not A&gt;B. The latter is just a serialization form.

Another legitimate serialization form is A>B (because right angle bracket does not need to be escaped in most cases). However, there may be compatibility reasons for wanting to produce A&gt;B, especially if your output is intended to be HTML (though you didn't mention that).
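A small sketch can make this concrete: both serialization forms parse to the same string in the DOM. The class name `SerializationForms` and the helper `rootTextValue` are hypothetical, and the sketch uses getTextContent() on the root element rather than getNodeValue() on its text child, purely to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SerializationForms {

    // Parse the XML and return the decoded text of the root element.
    public static String rootTextValue(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String escaped = rootTextValue("<foo>A&gt;B</foo>");
        String raw = rootTextValue("<foo>A>B</foo>");
        // Both documents carry the same three-character string.
        System.out.println(escaped.equals(raw)); // prints true
    }
}
```

Once the document is parsed, nothing in the DOM records which of the two forms the source file used.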

If you have a good reason for escaping the >, then I agree with @kensen john's answer for getting that done.
