I have a MySQL database that is fed data from a PHP-powered form. The table columns use the utf8_bin collation, and the connection charset is set to utf8, as is the HTML.
After extensive Googling, I cannot seem to find any clear way of using preg_replace to strip unwanted characters (and numbers) but keep upper/lowercase accents, umlauts and spaces. I've cobbled together something that seems to work - but I don't understand it at all, so have no idea how secure it is. Hence the doubling up with the escape clause:
$lname = preg_replace("/(<\/?)(\w+)([^>]*>)/e","", $lname);
$lname = mysql_real_escape_string($lname);
What I really need is the kind of clause that could take the following name (mine, as an example), "Éamonn Mac Lochlainn", and store it as such, rather than as "c389616d6f6e6eMacLochlainn". I've also looked at strip_tags, allowing "ÁÉÍÓÚáéíóú". Is that the way forward?
Any help, and in particular an explanation of what's going on in this snippet (the \w+ bits), would be greatly appreciated.
\w matches a word character according to the current locale. If the locale is set correctly for all of your data, that is no problem. If the locale is not enough, you can instead declare all letters and whitespace valid and strip everything else:
$lname = preg_replace('/[^\s\p{L}]/u','',$lname);
For more information about \w, see Escape sequences.
For more information about Unicode properties (the \p in combination with the /u switch), see Unicode Properties.
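Since the question asks what the individual pieces of its pattern do, here is a minimal sketch that runs the same pattern without the /e modifier (that modifier was removed in PHP 7.0 and is not needed for an empty replacement) and prints what each group captures. The sample input string is made up for illustration.

```php
<?php
// The pattern from the question, minus the /e modifier:
//   (<\/?)    group 1: "<" optionally followed by "/"  (opening or closing tag)
//   (\w+)     group 2: one or more word characters     (the tag name)
//   ([^>]*>)  group 3: anything that is not ">", then the closing ">"
$pattern = '/(<\/?)(\w+)([^>]*>)/';

// Hypothetical sample input, not taken from the question.
$input = 'Hello <b class="x">Éamonn</b>!';

preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);
foreach ($matches as $m) {
    printf("start=%s name=%s rest=%s\n", $m[1], $m[2], $m[3]);
}

// Replacing every match with an empty string removes the tags
// and leaves the text between them untouched.
$stripped = preg_replace($pattern, '', $input);
echo $stripped, "\n";
```

In other words, the pattern only removes things that look like HTML tags; it does nothing about digits or punctuation outside of tags.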
You also seem to be doing a bit more than just validating characters: you are stripping HTML tags as well. strip_tags would indeed work for this (call it before the replace).
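Putting the two steps together might look like the sketch below. The function name sanitize_name and the sample input are hypothetical, and the trim() call and the null check are additions of mine rather than part of the snippet above.

```php
<?php
// Hypothetical helper: strip HTML tags first, then keep only
// Unicode letters and whitespace.
function sanitize_name(string $raw): string
{
    // Remove anything that looks like an HTML/PHP tag.
    $noTags = strip_tags($raw);

    // \p{L} matches any Unicode letter; /u makes PCRE treat the
    // pattern and the subject as UTF-8.
    $lettersOnly = preg_replace('/[^\s\p{L}]/u', '', $noTags);

    // With /u, preg_replace returns null if the subject is not valid UTF-8.
    if ($lettersOnly === null) {
        return '';
    }

    // Removing characters can leave stray whitespace at the ends.
    return trim($lettersOnly);
}

echo sanitize_name('<b>Éamonn</b> Mac Lochlainn 123!'), "\n";
```

Note that this also removes apostrophes and hyphens, so a name such as O'Brien would come out as OBrien; if those should survive, they would need to be added to the character class.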
This solution may work for you if you only want to keep upper- and lower-case alphabetic characters in either French or English:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<?php
$str="Conférence ministérielle sur la francophonie canadienne - Éamonn Mac Lochlainn";
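// The /u modifier makes PCRE read the pattern and subject as UTF-8,
// so À-ÿ is a range of code points rather than of raw bytes.
echo preg_replace('/[^a-zA-ZÀ-ÿ ]/u', '', $str);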
?>
</body>
</html>
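The echoed output, as rendered in the browser (the hyphen is removed, and the two spaces left around it collapse into one in HTML), is: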
Conférence ministérielle sur la francophonie canadienne Éamonn Mac Lochlainn
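One caveat: when the range À-ÿ is read as Unicode code points (with the /u modifier), it also covers the multiplication sign × (U+00D7) and the division sign ÷ (U+00F7), which are not letters. A possible variation, sketched here under the assumption that only Latin-1 letters and spaces should survive, splits the range around those two characters and collapses leftover runs of spaces. The function name keep_latin_letters is hypothetical.

```php
<?php
// Hypothetical helper: keep ASCII letters, Latin-1 accented letters and spaces.
function keep_latin_letters(string $text): string
{
    // With /u the ranges are code point ranges:
    //   À-Ö = U+00C0..U+00D6, Ø-ö = U+00D8..U+00F6, ø-ÿ = U+00F8..U+00FF.
    // U+00D7 (×) and U+00F7 (÷) are deliberately left out.
    $clean = preg_replace('/[^a-zA-ZÀ-ÖØ-öø-ÿ ]/u', '', $text);

    // With /u, preg_replace returns null if the subject is not valid UTF-8.
    if ($clean === null) {
        return '';
    }

    // Collapse runs of spaces left behind by removed characters.
    $clean = preg_replace('/ {2,}/', ' ', $clean);

    return trim($clean);
}

echo keep_latin_letters('Conférence ministérielle sur la francophonie canadienne - Éamonn Mac Lochlainn'), "\n";
```

Letters outside Latin-1 (for example ł or ő) would still be removed by this character class; the \p{L} approach is the more general option.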