What are good options to restrict the type of html tags a user is allowed to enter into a form field? I'd like to be able to do that client side (presumably using JavaScript), server-side in PHP if it's too heavy for the user's browser, and possibly a combo of both if appropriate.
Effectively I'd like users to be able to submit data with the same tag-set as on Stack Overflow, plus maybe the standard MathML tags. The form must accept UTF-8 text, including Asian ideograms, etc.
In the application, the user must be able to submit text entries with basic HTML tags, and those entries must be able to be displayed to (potentially different) users with the HTML rendered correctly in a way that is safe for the users. I'm planning to use htmlspecialchars() and htmlspecialchars_decode() to protect my db server-side.
Many thanks,
JDelage
PS: I searched but couldn't find this question...
If you're looking to filter input against XSS attacks etc., consider using an existing library like HTML Purifier. I've not used it myself yet, but it promises a lot and is highly regarded.
HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.
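A minimal sketch of how that could look, assuming HTML Purifier is installed (for example through the ezyang/htmlpurifier Composer package) and its autoloader is already registered. The helper name purifyUserHtml and the tag list are illustrative assumptions, not something the library prescribes. If the library is not loaded, the sketch falls back to escaping everything.

```php
<?php
/**
 * Hypothetical helper: reduce user-supplied HTML to a small allow-list of tags.
 */
function purifyUserHtml(string $dirty): string
{
    if (!class_exists('HTMLPurifier')) {
        // Assumption: without the library, the safest fallback is to escape all markup.
        return htmlspecialchars($dirty, ENT_QUOTES, 'UTF-8');
    }

    $config = HTMLPurifier_Config::createDefault();
    // Illustrative allow-list; only href is permitted as an attribute, and only on <a>.
    $config->set('HTML.Allowed', 'p,br,b,i,em,strong,code,pre,blockquote,ul,ol,li,a[href]');
    $purifier = new HTMLPurifier($config);

    return $purifier->purify($dirty);
}

echo purifyUserHtml('<b>Hello</b><script>alert(1)</script>'), "\n";
```

Either way, no script tag reaches the page: with the library loaded the script element is dropped, and in the fallback it is displayed as literal text.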
I think it's much easier to use strip_tags and just specify the tags you want to allow.
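A short sketch of the strip_tags approach, with an illustrative allow-list (the variable names and the tag list are assumptions). One caveat worth knowing: strip_tags filters tag names only, so attributes on allowed tags, such as onclick, pass through untouched.

```php
<?php
// Hypothetical allow-list; adjust it to the tags you actually want to keep.
$allowedTags = '<p><br><b><i><em><strong><code><pre><blockquote><ul><ol><li>';

$input = '<p>Hello <b onclick="evil()">world</b><script>alert(1)</script></p>';

// Tags that are not in the allow-list are removed; their text content is kept.
$filtered = strip_tags($input, $allowedTags);

// The <script> tags are gone, but the onclick attribute on <b> is still there.
echo $filtered, "\n";
```

Because of the attribute caveat, this is probably only enough on its own when the result is escaped again or the allowed tags are later rebuilt without attributes.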
You could do something like this, if you are familiar with regular expressions:
<?php
function parse($string)
{
    // Escape angle brackets so that raw HTML tags are displayed instead of rendered
    $string = str_replace("<", "&lt;", $string); // Replace all < with the HTML entity
    $string = str_replace(">", "&gt;", $string); // Replace all > with the HTML entity
    $find = array(
        "%\*\*\*(.+?)\*\*\*%s", // Search for ***any string here***
        "%`(.+?)`%s",           // Search for `any string here`
    );
    $replace = array(
        "<b>\\1</b>", // Replace with <b>any string here</b>
        "<span style=\"background-color: #DDDDDD\">\\1</span>", // Replace with a highlighted <span>
    );
    $string = preg_replace($find, $replace, $string); // Do the find and replace
    return $string; // Return the output
}
echo parse("***Hello*** `There` <b>Friend</b>");
?>
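Outputs, as rendered in the browser ("Hello" in bold, "There" highlighted, and the <b> tags around "Friend" shown literally):
Hello There <b>Friend</b>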
I had a similar problem for some time. There were some $%^&*) who liked to post comments like <script>alert('Hello');</script> or something like that. I got tired of it and wrote a small function that allows only <br> or <br /> tags, so that messages still display normally.
I did it only in PHP, but I think it might help you.
function eliminateTags($msg) {
    // Turn newlines into <br /> tags
    $setBreaks = nl2br($msg);
    // Decode entities such as &lt; so that entity-encoded tags are stripped too
    $decodeHTML = htmlspecialchars_decode($setBreaks);
    // strip_tags() expects non-self-closing names in its allow-list;
    // on current PHP versions "<br>" keeps both <br> and <br />,
    // so no PHP version check is needed
    return strip_tags($decodeHTML, "<br>");
}
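For comparison, here is a sketch of a different technique for the same goal (plain-text messages with line breaks): escape everything first, then add the line breaks, instead of decoding and stripping. The function name renderPlainComment is hypothetical. Markup typed by the user is then displayed literally rather than removed.

```php
<?php
/**
 * Hypothetical helper: display a plain-text comment safely, keeping line breaks.
 */
function renderPlainComment(string $msg): string
{
    // Escape first, so any markup the user typed is shown as literal text.
    $escaped = htmlspecialchars($msg, ENT_QUOTES, 'UTF-8');
    // Then insert <br /> before each newline; these are the only real tags in the result.
    return nl2br($escaped);
}

echo renderPlainComment("Hi\n<script>alert('Hello');</script>"), "\n";
// Prints:
// Hi<br />
// &lt;script&gt;alert(&#039;Hello&#039;);&lt;/script&gt;
```

Passing 'UTF-8' explicitly keeps multi-byte input such as Asian ideograms intact.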