libxml2 HTML parsing

Developer https://www.devze.com 2023-01-12 10:19 Source: web

I'm parsing HTML with libxml2, using XPath to find elements. Once I have found the element I'm looking for, how can I get its HTML as a string (keeping in mind that the element will have many child elements)? Given a document:

<html>
    <header>
        <title>Some document</title>
    </header>

    <body>
        <p id="faq">
            Some kind of text <a href="http://www.nowhere.com/">here</a>.
        </p>
    </body>
</html>

Say I retrieve the body element with XPath and then ask for its HTML. I'd like to end up with a string containing:

<body>
    <p id="faq">
        Some kind of text <a href="http://www.nowhere.com/">here</a>.
    </p>
</body>

How can I do this?

That is the purpose of xmlNodeDump.

EDIT:

When you have an xmlNodePtr node, do something like:

xmlBufferPtr nodeBuffer = xmlBufferCreate();
xmlNodeDump(nodeBuffer, doc, node, 0, 1);
// ... Do something with nodeBuffer->content
// When done:
xmlBufferFree(nodeBuffer);

The 4th parameter (level) sets the starting indentation level, and the 5th (format) enables formatted, indented output when nonzero.