I have a function that strips unneeded whitespace from the output of my PHP page before saving the page to an HTML file for caching purposes.
However, some sections of my page contain source code in pre tags, and stripping this whitespace affects how the code is displayed. My regular-expression skills are poor, so I am basically looking for a way to stop this function from touching code inside:
<pre></pre>
This is the PHP function:
function sanitize_output($buffer)
{
    $search = array(
        '/\>[^\S ]+/s',  // strip whitespace after tags, except spaces
        '/[^\S ]+\</s',  // strip whitespace before tags, except spaces
        '/(\s)+/s',      // shorten multiple whitespace sequences
    );
    $replace = array(
        '>',
        '<',
        '\\1',
    );
    $buffer = preg_replace($search, $replace, $buffer);
    return $buffer;
}
Thanks for your help.
Here's what I found to work:
Solution:
function stripBufferSkipPreTags($buffer){
    $poz_current = 0;
    $poz_end = strlen($buffer);
    $result = "";
    while ($poz_current < $poz_end){
        $t_poz_start = stripos($buffer, "<pre", $poz_current);
        if ($t_poz_start === false){
            // no pre block left: strip the remainder and stop
            $result .= stripBuffer(substr($buffer, $poz_current));
            $poz_current = $poz_end;
        }
        else{
            // strip the part in front of the pre block
            $result .= stripBuffer(substr($buffer, $poz_current, $t_poz_start-$poz_current));
            $t_poz_end = stripos($buffer, "</pre>", $t_poz_start);
            if ($t_poz_end === false){
                // unclosed pre block: keep the rest of the buffer untouched
                $t_poz_end = $poz_end;
            }
            // copy the pre block as-is; its </pre> is handled by the next pass
            $result .= substr($buffer, $t_poz_start, $t_poz_end-$t_poz_start);
            $poz_current = $t_poz_end;
        }
    }
    return $result;
}
function stripBuffer($buffer){
    // change new lines and tabs to single spaces
    $buffer = str_replace(array("\r\n", "\r", "\n", "\t"), ' ', $buffer);
    // collapse runs of spaces into a single space
    $buffer = preg_replace('/ {2,}/', ' ', $buffer);
    // remove single spaces between tags
    $buffer = str_replace("> <", "><", $buffer);
    // remove single spaces around &nbsp; entities
    $buffer = str_replace(" &nbsp;", "&nbsp;", $buffer);
    $buffer = str_replace("&nbsp; ", "&nbsp;", $buffer);
    return $buffer;
}
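The same idea can probably be written more compactly by letting preg_split() cut the buffer at complete pre blocks. This is only a sketch, not the code from the answer above: the function name minifyOutsidePre is made up for illustration, and it assumes pre blocks are not nested and that collapsing every whitespace run to a single space is acceptable everywhere else.

```php
<?php
// Hypothetical helper: collapse whitespace everywhere except inside <pre>...</pre>.
function minifyOutsidePre(string $html): string
{
    // Split on complete <pre> blocks. The capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps the blocks themselves in the result.
    $parts = preg_split('#(<pre\b[^>]*>.*?</pre>)#is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // Regex engine error (e.g. backtrack limit): return the input untouched.
        return $html;
    }
    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the captured <pre> blocks.
        if ($i % 2 === 1) {
            continue;
        }
        // Even indexes are ordinary markup: collapse each whitespace run to one space.
        $parts[$i] = preg_replace('/\s+/', ' ', $part);
    }
    return implode('', $parts);
}

// Usage: the line break and indentation inside <pre> survive.
echo minifyOutsidePre("<p>a\n\n  b</p><pre>x\n  y</pre>\n<p>c</p>");
```

The \b after pre keeps the pattern from matching other tag names that merely start with "pre", which the plain stripos($buffer, "<pre") search would also match.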
Regular expressions are known to be a poor tool for parsing HTML.
That said, try to do what you need another way, for example with a DOM parser whose HTML output you customize.
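A DOM-based version might look roughly like the sketch below, which uses PHP's bundled DOMDocument and DOMXPath classes. The function name collapseWhitespaceWithDom is hypothetical, and several assumptions apply: the input is a non-empty, complete HTML document, its character encoding is declared in the markup (otherwise loadHTML() may misread non-ASCII text), and it is acceptable for saveHTML() to re-serialize the page, which can add a doctype or normalize tags.

```php
<?php
// Hypothetical helper: collapse whitespace in text nodes that are not inside
// <pre>, <textarea>, <script> or <style>, using the DOM instead of regexes on tags.
function collapseWhitespaceWithDom(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings (e.g. for unknown HTML5 tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select every text node that has no whitespace-sensitive ancestor.
    $query = '//text()[not(ancestor::pre or ancestor::textarea'
           . ' or ancestor::script or ancestor::style)]';
    foreach ($xpath->query($query) as $textNode) {
        // Collapse each whitespace run inside the text node to a single space.
        $textNode->data = preg_replace('/\s+/', ' ', $textNode->data);
    }
    return $doc->saveHTML();
}

// Usage: whitespace in the paragraph is collapsed, the <pre> content is kept.
echo collapseWhitespaceWithDom("<html><body><p>a   b</p><pre>x\n  y</pre></body></html>");
```

Because only text nodes are touched, attribute values and the tags themselves are never rewritten by the whitespace pattern.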
If you are compressing to save disk space, you should consider using gz compression instead (php.net/gz_deflate).
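As a rough illustration of the compression suggestion, the cached page could be written and read back with gzdeflate() and gzinflate() from PHP's zlib extension. The helper names and the cache path are invented for this sketch, and it assumes the zlib extension is enabled.

```php
<?php
// Hypothetical helper: store a cached page in deflate-compressed form.
function writeCompressedCache(string $path, string $html): bool
{
    // Level 9 favours smaller files over compression speed.
    return file_put_contents($path, gzdeflate($html, 9)) !== false;
}

// Hypothetical helper: read a cached page back, or null if it is missing or corrupt.
function readCompressedCache(string $path): ?string
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        return null;
    }
    $html = @gzinflate($raw);
    return $html === false ? null : $html;
}

// Usage: round-trip a page through a temporary cache file.
$cacheFile = tempnam(sys_get_temp_dir(), 'page');
$page = "<html><body><pre>x\n  y</pre></body></html>";
writeCompressedCache($cacheFile, $page);
var_dump(readCompressedCache($cacheFile) === $page);
unlink($cacheFile);
```

Compression leaves the markup byte-for-byte intact after decompression, so whitespace inside pre blocks is never at risk.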