
DocumentBuilder parsing breaks string when it hits '&'

Developer https://www.devze.com 2023-02-15 02:16 Source: Internet

I have this XML:

    <user>
    <name>H &amp; M</name>

and I parse it using this code:


    DocumentBuilder documentBuilder = null;
    Document document = null;

    try {
        documentBuilder = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder();
        document = documentBuilder.parse(is);
    } catch (Exception e) {
        return result;
    }
    NodeList nl = document.getElementsByTagName(XML_RESPONSE_ROOT);
    if (nl.getLength() > 0) {
        resp_code = nl.item(0).getAttributes().getNamedItem(
                XML_RESPONSE_STATUS).getNodeValue();
        if (resp_code.equals(RESP_CODE_OK_SINGLE)) {
            nl = document.getElementsByTagName(XML_RESPONSE_TAG_CONTACT);
            NodeList values = nl.item(i).getChildNodes();

etc.

When I get the node value with node.getNodeValue(), I get only what comes before the ampersand, even though the ampersand is escaped.

I want to get the whole string: "H & M"

Thanks.


It depends on how your XML document was constructed. In particular, the text "H & M" can be represented as several adjacent Text nodes, while your code expects just one. Try calling nodeVariable.normalize() before getting its value.

According to DOM parser API: "normalize() - Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are neither adjacent Text nodes nor empty Text nodes..."
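To illustrate the effect described above, here is a minimal sketch. It does not reproduce the asker's parser. It builds a "name" element by hand with three adjacent Text nodes, on the assumption that the parser split the text around the entity in a similar way. The class name NormalizeSketch and the helper splitName() are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NormalizeSketch {

    // Hypothetical helper: builds <name> with three adjacent Text nodes,
    // imitating a parser that splits the text around "&amp;".
    static Element splitName() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("H "));
            name.appendChild(doc.createTextNode("&"));
            name.appendChild(doc.createTextNode(" M"));
            doc.appendChild(name);
            return name;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element name = splitName();
        // Before normalize(): the first child holds only the first fragment.
        System.out.println("[" + name.getFirstChild().getNodeValue() + "]");
        // normalize() merges the adjacent Text nodes into one.
        name.normalize();
        System.out.println("[" + name.getFirstChild().getNodeValue() + "]");
    }
}
```

In this sketch the first line printed is the fragment before the ampersand and the second is the full "H & M", which matches the symptom in the question.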


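Find the "name" Element and call getTextContent() on it. This returns the concatenated text of all its descendant Text nodes, so it does not matter how the parser split them.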
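A minimal sketch of this approach, assuming the document is well-formed. The closing </user> tag is added here because the question shows only the start of the XML, and the class name TextContentSketch and method nameOf() are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TextContentSketch {

    // Parses the XML string and returns the full text of the first <name> element.
    static String nameOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element name = (Element) doc.getElementsByTagName("name").item(0);
            // getTextContent() concatenates all descendant text,
            // regardless of how many Text nodes the parser produced.
            return name.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed complete input; only the first two lines appear in the question.
        System.out.println(nameOf("<user><name>H &amp; M</name></user>"));
    }
}
```

Unlike getNodeValue() on a single child node, this reads the element as a whole, so no call to normalize() is needed first.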
