How do I fix this indentation issue with DOMDocument?

Developer https://www.devze.com 2023-02-27 08:43 Source: Internet
I just started using the DOMDocument object because I want to parse an uploaded HTML file and then use it as a template for my CMS.

I'm loading HTML from a file and, for testing purposes, saving it as a new HTML file without changing anything. The problem is that the indentation gets messed up.

Here's what my HTML file looks like:

<!DOCTYPE html>
<html>
    <head>
        <title>DOM Testpage</title>
        <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
        <meta name="language" content="deutsch, de" />
    </head>
    <body>
        <div class="pageOverlay"></div>
        <div style="height:100px;"></div>
        <div id="LoginForm">
            <div id="LoginLogo">
                Here's some Text
                <br />
                And another Text with some German Umlauts: öäü ÖÄÜ ß and so on...
                <br />
            </div>
            <form method="post" action="">
                <!-- Here be dragons. And a nice comment -->
                <input type="text" name="cms_user" value="" class="InputText " data-defaultvalue="Username" title="Please enter your username." style="margin:0px 0px 20px 0px;" />
                <input type="password" name="cms_password" value="" class="InputText " data-defaultvalue="Password" title="Please enter your password." style="margin:0px 0px 20px 0px;" />
                <input type="checkbox" name="cms_remember_login" value="1" id="cms_remember_login" />
                <label for="cms_remember_login" style="line-height:14px; margin-left:5px;">Remember Login</label>
                <!-- Another comment
                This one's even
                longer -->
                <input type="submit" name="submitLogin" value="Login" />
            </form>
        </div>
    </body>
</html>

The PHP part:

<?php
    $lo_dom = new DOMDocument();
    $lo_dom->loadHTMLFile("test.html");
    $lo_dom->saveHTMLFile("templates/test_neu.html");
?>

When I open the new HTML file, the source looks like this:

<!DOCTYPE html>
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><title>DOM Testpage</title><meta name="language" content="deutsch, de"></head><body>
        <div class="pageOverlay"></div>
        <div style="height:100px;"></div>
        <div id="LoginForm">
            <div id="LoginLogo">
                Here's some Text
                <br>
                And another Text with some German Umlauts: &ouml;&auml;&uuml; &Ouml;&Auml;&Uuml; &szlig; and so on...
                <br></div>
            <form method="post" action="">
                <!-- Here be dragons. And a nice comment -->
                <input type="text" name="cms_user" value="" class="InputText " data-defaultvalue="Username" title="Please enter your username." style="margin:0px 0px 20px 0px;"><input type="password" name="cms_password" value="" class="InputText " data-defaultvalue="Password" title="Please enter your password." style="margin:0px 0px 20px 0px;"><input type="checkbox" name="cms_remember_login" value="1" id="cms_remember_login"><label for="cms_remember_login" style="line-height:14px; margin-left:5px;">Remember Login</label>
                <!-- Another comment
                This one's even
                longer -->
                <input type="submit" name="submitLogin" value="Login"></form>
        </div>
    </body></html>

I already tried setting preserveWhiteSpace and formatOutput but that doesn't change anything.

It's not a big deal at all, but it'd be nice if the output looked like the input.

Any ideas how to fix this?

And another question: is there a way to manually insert a \n line break after adding another node with appendChild()?


The correct way to reformat a document with DOM is

$dom = new DOMDocument();
$dom->preserveWhiteSpace = FALSE;
$dom->loadHTMLFile("test.html");
$dom->formatOutput = TRUE;
$dom->saveHTMLFile("templates/test_neu.html");

If that doesn't result in the desired output, you can still add whitespace yourself. Any whitespace used for formatting purposes is a DOMText node. See my answers

  • DOMDocument in php and
  • Printing content of a XML file using XML DOM

for a more detailed explanation. An alternative to that would be to use Tidy to reformat the code or any of the tools suggested in https://stackoverflow.com/search?q=html+beautifier+php
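To illustrate the point about formatting whitespace being DOMText nodes, here is a minimal sketch of adding a line break and indentation by hand around nodes inserted with appendChild(). The element names and labels are made up for the example, and it serializes with saveXML() on a single node only to keep the output predictable; the same text nodes would be written out by the HTML save methods as well.

```php
<?php
// Build a small fragment and add the formatting whitespace manually.
$dom  = new DOMDocument();
$list = $dom->appendChild($dom->createElement('ul'));

foreach (['one', 'two'] as $label) {
    // A line break plus indentation is just another text node.
    $list->appendChild($dom->createTextNode("\n    "));
    $list->appendChild($dom->createElement('li', $label));
}

// Final line break so the closing tag starts on its own line.
$list->appendChild($dom->createTextNode("\n"));

$out = $dom->saveXML($list);
echo $out, "\n";
```

Since formatOutput is left at its default here, the serializer adds no whitespace of its own, so whatever line breaks show up in the result come from the text nodes you inserted.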


Came across this question whilst looking for a way to indent XSLTProcessor output. Here is an ungraceful alternative approach that might save somebody some time:

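// $xml is assumed to be an existing DOMDocument (its creation is not shown here)
$xml->preserveWhiteSpace = false;
$xml->formatOutput = true;

// Serialize as XML, then drop everything before the <html> element
$html = $xml->saveXML();
$html = strstr($html, '<html');

file_put_contents('output.html', $html);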

No other configuration worked, at least for me.
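A possible variation on the approach above, sketched under a few assumptions: instead of cutting the string with strstr(), saveXML() can be given the root element, which serializes only that node and so leaves out the XML declaration. The markup below is a made-up, well-formed sample loaded with loadXML(); real HTML loaded through loadHTMLFile() may behave differently, and XML serialization writes empty elements in XML style.

```php
<?php
// Made-up sample markup; whitespace handling is set up before loading.
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // drop whitespace-only text nodes on load
$dom->formatOutput = true;        // let the serializer indent the output
$dom->loadXML('<html><body><div><p>Hi</p></div></body></html>');

// Passing a node to saveXML() serializes just that node,
// so no XML declaration is emitted and no string trimming is needed.
$html = $dom->saveXML($dom->documentElement);

echo $html, "\n";
```

Note that preserveWhiteSpace is set before the document is loaded, as in the first answer, since it affects parsing rather than saving.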
