Okay... I know <- can't be excluded from strip_tags using allowable tags per se, so I'm trying a workaround. The workaround works fine on character sequences that wouldn't be valid HTML to begin with, such as << or <~. However, when I use the code below to convert <- and -> to digits before strip_tags runs, and then back from digits to <- and -> afterwards, all HTML from the first occurrence of those symbols onward is removed, or at least not processed. I understand I can't have it left alone through allowable tags, which is why I convert it before strip_tags and back after. But it's almost as if strip_tags still removes it even though the conversion back happens after the strip_tags line, since it removes <- and takes everything to the right of it. Any ideas or other ways to try? I've also tried defining <- as <— and replacing it with other symbols as well, such as #-, but no matter what I get the same outcome.
I should also mention that <- and -> aren't used together; they are used to point to things in text. Like: internt <- is misspelled there.
`<?php
$data = file_get_contents("test.html");
$data = str_replace("<-", "999", $data);
$data = str_replace("->", "998", $data);
$data = strip_tags($data, '');
$data = str_replace("999", "<-", $data);
$data = str_replace("998", "->", $data);
echo $data;
?>`
While gathering sample data I realized that if I remove a good chunk of the sample HTML, everything works fine. It turns out that if I strip actual HTML comments such as <!-- Header //--> myself first, the conversion goes fine, so I'm going to look for a regex to remove the HTML comments before the conversion and strip_tags.
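A likely reason the comments matter: the `->` replacement also rewrites the `-->` that closes each HTML comment, so by the time strip_tags runs, the comment is no longer terminated. Here is a minimal sketch of that effect. The sample string is made up for illustration; only the two str_replace calls come from the code above.

```php
<?php
// Hypothetical sample input; the comment mirrors the one mentioned in the question.
$html = "<!-- Header //--><p>internt <- is misspelled there</p>";

// Same placeholder swap as in the question.
$swapped = str_replace("<-", "999", $html);
$swapped = str_replace("->", "998", $swapped);

// The "->" at the end of "-->" was replaced too, so no closing "-->" is left:
// <!-- Header //-998<p>internt 999 is misspelled there</p>
echo $swapped, "\n";
var_dump(strpos($swapped, "-->")); // bool(false)
```

With the closing marker gone, strip_tags appears to treat everything after `<!--` as part of the comment, which would match the "everything to the right is removed" symptom.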
Update
I used the code below to remove the HTML comments first, and that works. Thanks for your help.
`$data = preg_replace('/<!--(.*)-->/', '', $data);`
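One caveat with that pattern: `(.*)` is greedy and, without the `s` modifier, `.` does not match newlines, so a comment spanning several lines is missed and two comments on one line are removed together with whatever sits between them. Below is a hedged sketch of the whole pipeline with a non-greedy, multi-line pattern. The function name `strip_tags_keep_arrows` and the `@@LARROW@@` / `@@RARROW@@` placeholders are my own illustrative choices and are assumed not to occur in the input.

```php
<?php
// Sketch only: the function name and placeholders are hypothetical, not from the original post.
function strip_tags_keep_arrows(string $html): string
{
    // 1. Remove HTML comments first. ".*?" is non-greedy; the "s" modifier lets "." match newlines.
    //    preg_replace returns null on error, so fall back to the untouched input.
    $html = preg_replace('/<!--.*?-->/s', '', $html) ?? $html;

    // 2. Swap the arrows for placeholders that strip_tags will not mistake for markup.
    $html = str_replace(['<-', '->'], ['@@LARROW@@', '@@RARROW@@'], $html);

    // 3. Strip the remaining tags.
    $text = strip_tags($html);

    // 4. Restore the arrows.
    return str_replace(['@@LARROW@@', '@@RARROW@@'], ['<-', '->'], $text);
}

echo strip_tags_keep_arrows("<!-- Header //-->\n<p>internt <- is misspelled there</p>");
```

Using named placeholders instead of digits like 999 also avoids corrupting text that happens to contain those numbers.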
Update:
$string = "<div>words words wrods <- words words</div>";
$string = str_replace('<-', '&lt;-', $string);
echo strip_tags($string);
Output (Source):
words words wrods &lt;- words words
Output (HTML):
words words wrods <- words words