Which PHP function validates whether a string is valid HTML?

Developer https://www.devze.com 2023-01-05 07:53 Source: Internet
Which PHP function validates whether a string is HTML? My goal is to take input from the user and check that the input is HTML and not just a plain string.

Example of a non-HTML string:

sdkjshdk<div>jd</h3>ivdfadfsdf or sdkjshdkivdfadfsdf

Example of an HTML string:

<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>

Thanks


Maybe you need to check whether the string is well-formed.

I would use a function like this:

function check($string) {
  $start = strpos($string, '<');
  if ($start === false) {
    return false; // no tag at all, so not HTML
  }

  $end = strrpos($string, '>', $start);
  if ($end === false) {
    return false; // a '<' that is never closed by '>'
  }

  // keep only the part from the first '<' to the last '>'
  $string = substr($string, $start, $end - $start + 1);

  libxml_use_internal_errors(true); // collect errors instead of emitting warnings
  libxml_clear_errors();
  simplexml_load_string($string);
  return count(libxml_get_errors()) === 0;
}

Just a warning: HTML permits unbalanced markup like the following. It is not a well-formed XML chunk, but it is a legal HTML chunk, because the closing </li> tag is optional in HTML:

<ul><li>Hi<li> I'm another li</li></ul>
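
To make the warning concrete, here is a small sketch (not part of the original answer) that feeds the same snippet to an XML parser and to PHP's HTML parser. It assumes the libxml-based SimpleXML and DOM extensions are enabled, which is the case in most PHP builds.

```php
<?php
// The snippet from above: legal HTML, because </li> end tags may be omitted.
$snippet = "<ul><li>Hi<li> I'm another li</li></ul>";

libxml_use_internal_errors(true); // collect parser errors instead of emitting warnings

// Parsed as XML, the unclosed first <li> makes the input ill-formed.
$asXml = simplexml_load_string($snippet);
var_dump($asXml === false); // bool(true)
libxml_clear_errors();

// Parsed as HTML, the parser closes the first <li> implicitly,
// so the resulting tree contains two sibling <li> elements.
$dom = new DOMDocument();
$dom->loadHTML($snippet);
$liCount = $dom->getElementsByTagName('li')->length;
echo $liCount, PHP_EOL;
libxml_clear_errors();
```

So an XML-based check like the one above rejects some markup that browsers accept without complaint.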

Disclaimer: I've modified the code (without testing it) in order to detect well-formed HTML inside the string.

A last thought: maybe you should use strip_tags to control user input (as I've seen in your comments).


You can use DOMDocument's loadHTML method.
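
A minimal sketch of that idea, assuming the DOM extension is available. The helper name htmlParseErrors and the 'empty input' message are made up for illustration; the libxml and DOMDocument calls are the standard ones.

```php
<?php
// Hypothetical helper: returns the libxml error messages produced while
// parsing $html as HTML. An empty array means the parser did not complain.
function htmlParseErrors(string $html): array
{
    // loadHTML() rejects empty input, so handle it up front.
    if (trim($html) === '') {
        return ['empty input']; // illustrative message, not a libxml one
    }

    $previous = libxml_use_internal_errors(true); // collect errors quietly
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    return $messages;
}

print_r(htmlParseErrors('<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>'));
print_r(htmlParseErrors('sdkjshdk<div>jd</h3>ivdfadfsdf'));
```

Keep in mind that the HTML parser is lenient by design, so an empty error list does not by itself prove that the input contains any tags at all.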


simplexml_load_string will fail if the string doesn't have a single root node. So if you try this HTML, it will be reported as invalid:

<p>A</p><p>B</p>

Here's my function:

function check($string) {
    $start = strpos($string, '<');
    if ($start === false) {
        return false; // no tag at all, so not HTML
    }

    $end = strrpos($string, '>', $start);
    if ($end === false) {
        return false; // a '<' that is never closed by '>'
    }

    // keep only the part from the first '<' to the last '>'
    $string = substr($string, $start, $end - $start + 1);

    // xml requires one root node
    $string = "<div>$string</div>";

    libxml_use_internal_errors(true); // collect errors instead of emitting warnings
    libxml_clear_errors();
    simplexml_load_string($string);

    return count(libxml_get_errors()) === 0;
}
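
The single-root requirement described in this answer can be seen directly with a short sketch, assuming the SimpleXML extension is enabled:

```php
<?php
libxml_use_internal_errors(true); // keep parse errors out of the output

$fragment = '<p>A</p><p>B</p>';

// Two top-level elements: not a well-formed XML document.
$bare = simplexml_load_string($fragment);
var_dump($bare === false); // bool(true)
libxml_clear_errors();

// Wrapped in a single root element, the same markup parses.
$wrapped = simplexml_load_string("<div>$fragment</div>");
var_dump($wrapped instanceof SimpleXMLElement); // bool(true)
echo count($wrapped->p), PHP_EOL; // 2
libxml_clear_errors();
```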


Do you mean HTML or XHTML?

The HTML standard and interpretation are so loose that your first snippet might work. It won't be pretty but you might get something.

XHTML is quite a bit more strict and at minimum will expect your snippet to be well-formed (all opened tags are closed; tags can nest but not overlap) and may throw warnings if you have unrecognized elements or attributes.

Something like Tidy - http://php.net/manual/en/book.tidy.php - is probably a good start. Once you load your snippet using that, you can use tidy_error_count or tidy_get_error_buffer to see if it's "okay enough" for your needs.
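
A rough sketch of that approach. It assumes the optional tidy extension is installed; the helper name tidyReport is made up for illustration, and where to draw the "okay enough" line is left to you.

```php
<?php
// Hypothetical helper: returns Tidy's diagnostics for $html, or null when the
// tidy extension is not installed or the input could not be parsed.
function tidyReport(string $html): ?array
{
    if (!function_exists('tidy_parse_string')) {
        return null; // tidy is an optional extension
    }

    $tidy = tidy_parse_string($html, [], 'utf8');
    if ($tidy === false) {
        return null;
    }

    return [
        'errors'   => tidy_error_count($tidy),
        'warnings' => tidy_warning_count($tidy),
        'buffer'   => (string) tidy_get_error_buffer($tidy), // human-readable log
    ];
}

$report = tidyReport('sdkjshdk<div>jd</h3>ivdfadfsdf');
if ($report === null) {
    echo "tidy is not available", PHP_EOL;
} else {
    echo $report['errors'], " error(s), ", $report['warnings'], " warning(s)", PHP_EOL;
    echo $report['buffer'], PHP_EOL;
}
```

Tidy may classify recoverable markup problems as warnings rather than errors, so it can be worth looking at both counts before deciding.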


Are you trying to prevent users from posting HTML tags instead of plain strings? If that is what you want, you just need strip_tags(), which will remove any HTML tags from the string.
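
For illustration, here is a small sketch of strip_tags() on the strings from the question. The helper containsTags is hypothetical; it only tells you whether something tag-like was present, not whether the markup is well-formed.

```php
<?php
// Hypothetical helper: true when strip_tags() would remove something,
// i.e. the input contains at least one thing that looks like a tag.
function containsTags(string $input): bool
{
    return strip_tags($input) !== $input;
}

echo strip_tags('<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>'), PHP_EOL;
// prints: sdfsdfsdfdghdhdgh fdsgfgdfgfd

var_dump(containsTags('sdkjshdkivdfadfsdf'));             // bool(false)
var_dump(containsTags('sdkjshdk<div>jd</h3>ivdfadfsdf')); // bool(true)
```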


You should use:

$html = "<html><body><p>This is array.</p><br></body></html>";

libxml_use_internal_errors(true); // collect errors instead of emitting warnings
libxml_clear_errors();
$dom = new DOMDocument();
$dom->loadHTML($html);
if (empty(libxml_get_errors())) {
  echo "This is valid HTML";
} else {
  echo "This is not valid HTML";
}


If you also want to keep your site secure, you should run the input through an HTML cleaner such as HTML Purifier, Tidy, etc.
